Haystack: Highlight only one field

I'm using Haystack 2.3.0, and I have a search index like:
class MyModelIndex(indexes.SearchIndex, indexes.Indexable):
    text = indexes.CharField(document=True, use_template=True)
    name = indexes.EdgeNgramField(model_attr='name', boost=1.250)
    short_description = indexes.CharField(model_attr='short_description', null=True, boost=1.125)
    description = indexes.CharField(model_attr='description', null=True, boost=1.125)
    detail_description = indexes.CharField(model_attr='detail_description', null=True)

    def get_model(self):
        return MyModel
I'd like to highlight only the detail_description field. The official documentation gives this example:
sqs = SearchQuerySet().filter(content='foo').highlight()
result = sqs[0]
result.highlighted['text'][0]
But when I try this I don't get the same result. In the example above, result.highlighted appears to be a dictionary that gives access to the highlight for each field:
result.highlighted['text'][0]
But in my case result.highlighted is not a dictionary; it is a list, and it only returns the highlight for the text field.
How can I restrict highlighting to one specific field?

If the number_of_fragments value is set to 0 then no fragments are
produced, instead the whole content of the field is returned, and of
course it is highlighted. This can be very handy if short texts (like
document title or address) need to be highlighted but no fragmentation
is required. Note that fragment_size is ignored in this case.
From http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/search-request-highlighting.html
You would need to find out how to set this parameter through Haystack.

My temporary solution for now is to loop over the results and apply the highlight to the field manually, like this:
from haystack.utils.highlighting import Highlighter

highlight = Highlighter(my_query)  # my_query has the word(s) of the query
for result in sqs:
    result.detail_description = highlight.highlight(result.detail_description)

A bit late here, but if you want to pass extra params to highlight, you need to pass a dict of whatever params elasticsearch would want for the highlight function, like so:
# Solr example since I'm not familiar with ES
sqs = SearchQuerySet().filter(content='foo').highlight(**{'hl.fl': 'short_description'})

Related

Filtering django queryset by text applying the diacritics

I am trying to filter a Django queryset by text.
The model is:
models.Offer
id = pk
description = text
I am trying to filter it like this:
someText = self.shave_marks(someText)
offers = offers.filter(description__icontains=someText)
Here shave_marks replaces special characters, e.g. ç becomes c.
The text in the database (in the description field) also contains special characters, so what I need is to "shave" the description text first and then do the filtering.
Any help is appreciated, thank you very much!

How about this?
offers = [(x, x.description) for x in Offer.objects.all()]
required_offers = []
for key, value in offers:
    if someText in shave_marks(value):
        required_offers.append(key)
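The question never shows shave_marks itself, so here is a minimal sketch of how such a helper might look, using only the standard library. The function body, the filter_by_shaved_text helper and the sample descriptions are assumptions for illustration, not the asker's code; the filtering runs on plain strings in Python rather than in the database.

```python
import unicodedata


def shave_marks(text):
    """Return text with combining diacritical marks removed (e.g. ç -> c)."""
    # NFD splits each accented character into a base character plus combining marks.
    decomposed = unicodedata.normalize("NFD", text)
    shaved = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # Recompose whatever is left so the result is in the usual NFC form.
    return unicodedata.normalize("NFC", shaved)


def filter_by_shaved_text(descriptions, needle):
    """Hypothetical helper: keep descriptions containing needle, ignoring marks and case."""
    target = shave_marks(needle).casefold()
    return [d for d in descriptions if target in shave_marks(d).casefold()]


# Made-up sample data standing in for Offer.description values.
descriptions = ["Promoção de verão", "Winter sale", "Promocao especial"]
matches = filter_by_shaved_text(descriptions, "promoçao")
```

Both sides of the comparison are shaved, which is the point of the question: shaving only the search term cannot match accented text stored in the database.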

What you can do is create a custom field that extends CharField and overrides the get_prep_value method.
I could not find a concrete example, but in theory this should work.
class SpecialField(models.CharField):
    def get_prep_value(self, value):
        return shave_marks(value)

Django - filter queryset disregarding spaces

I have a model that stores phone numbers in external_number, a char field:
models.py
class PhoneRecord(models.Model):
    def __unicode__(self):
        return "Call to %s (%s)" % (self.external_number, self.call_date.strftime("%c"))

    INBOUND = "I"
    OUTBOUND = "O"
    DIRECTION_CHOICES = (
        (INBOUND, "Inbound"),
        (OUTBOUND, "Outbound"),
    )
    external_number = models.CharField(max_length=20)
    call_date = models.DateTimeField()
    external_id = models.CharField(max_length=20)
    call_duration = models.TimeField()
    call_direction = models.CharField(max_length=1, choices=DIRECTION_CHOICES, default=INBOUND)
    call = models.FileField(upload_to='calls/%Y/%m/%d')
The form is cleaning and storing the data using the UKPhoneNumberField from https://github.com/g1smd/django-localflavor-UK-forms/blob/master/django/localflavor/uk/forms.py
This means that the number is stored in the database in the format 01234 567890, i.e. with a space in it.
I have created a filter using django-filters that works well when searching for a partial phone number, except that it doesn't match when the search term omits the space, i.e.
search for 01234 returns the record for the example above
search for 567890 returns the record for the example above
search for 01234 567890 returns the record for the example above
search for 01234567890 does not return the record for the example above
Now, I could subject the filter form to the same restrictions (i.e. use UKPhoneNumberField as the input), but that takes away the ability to search for partial phone numbers.
I have also explored using something like django-phonenumber-field, which would control both the model and the form, but the validation provided by UKPhoneNumberField lets me validate based on the type of number entered (e.g. mobile or geographical).
What I would ideally like is one of the following:
My filter to ignore spaces, whether entered by the user in the search query or present in the stored value. Is this even possible?
To apply the validation provided by UKPhoneNumberField to another field type without having to analyse and rewrite all the regular expressions provided.
Some other UK phone number validation I have not found yet!

You could set up the filter to do something like this:
phone_number = "01234 567890"
parts = phone_number.split(" ")
PhoneRecord.objects.filter(
    external_number__startswith="{} ".format(parts[0]),
    external_number__endswith=" {}".format(parts[1]),
)
That way the filter looks for the first half of the number followed by the space and the second half preceded by the space. The only records returned would be those with the value "01234 567890".

I ended up adding a custom filter with a field_class = UKPhoneFilter. This means I can't search on partial numbers, but that was removed as a requirement from the project.

You should think about normalizing the data before you save it to your DB.
You could just use django-phonenumber-field, which does just that.
As for your problem, you can always use Django's regex field lookup to query with a regular expression,
e.g. MyModel.objects.filter(myfield__regex=r'[a-zA-Z]+')
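To make the regex idea concrete for the phone number case, here is a rough sketch that turns a search term into a pattern allowing optional spaces between digits. The helper name spaced_digits_pattern is made up for illustration, and the pattern is checked here with Python's re module only. Passing it to external_number__regex assumes your database's regex dialect accepts the same syntax, which is worth verifying.

```python
import re


def spaced_digits_pattern(query):
    """Hypothetical helper: build a regex matching the query's digits with optional spaces."""
    # Drop anything the user typed that is not a digit (spaces, dashes, ...).
    digits = [ch for ch in query if ch.isdigit()]
    # " *" between consecutive digits lets "01234567890" match "01234 567890".
    return " *".join(digits)


stored = "01234 567890"  # the format the question says is stored

full_match = re.search(spaced_digits_pattern("01234567890"), stored) is not None
partial_match = re.search(spaced_digits_pattern("4567"), stored) is not None
no_match = re.search(spaced_digits_pattern("999"), stored) is not None
```

Partial searches keep working because the pattern is not anchored. An empty search term yields an empty pattern that matches everything, so a caller would probably want to skip the filter in that case.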

Advanced Django ORM query with operator.or_ fails with None Values

I'm trying to query the database and exclude rows that contain one of certain strings.
My simplified model looks like this:
class Message(models.Model):
    text = models.TextField(blank=True, null=True)
My query looks as follows:
import operator
from functools import reduce
from django.db.models import Q

ignored_patterns = ['<ignore>', ]
messages = Message.objects.exclude(
    reduce(operator.or_, (Q(text__contains=pattern) for pattern
                          in ignored_patterns))
)
The problem I have is that Messages with self.text = None are somehow excluded too.
I'm thankful for every hint.

You can add an AND condition inside exclude:
exclude = reduce(operator.or_, (Q(text__contains=pattern) for pattern
                                in ignored_patterns))
nulltext = Q(text__isnull=False)
messages = Message.objects.exclude(nulltext & exclude)
Also, read about the use of null on string fields:
Avoid using null on string-based fields such as CharField and
TextField because empty string values will always be stored as empty
strings, not as NULL.
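One plausible explanation for the disappearing rows is SQL's three-valued logic: negating a comparison against NULL gives NULL, not true, so the row is filtered out. The sketch below shows this with the standard library's sqlite3 module. The table and the SQL are hand-written for illustration and are an assumption, not the query Django actually generates for the code above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE message (id INTEGER PRIMARY KEY, text TEXT)")
conn.executemany(
    "INSERT INTO message (text) VALUES (?)",
    [("hello",), ("<ignore> me",), (None,)],
)

# Plain negation: for the NULL row, NOT (NULL LIKE ...) is NULL, so the row is dropped.
naive = conn.execute(
    "SELECT text FROM message "
    "WHERE NOT (text LIKE '%<ignore>%') ORDER BY id"
).fetchall()

# Guarded negation, mirroring exclude(nulltext & exclude): the IS NOT NULL test
# makes the inner condition false for NULL rows, so NOT (...) is true for them.
guarded = conn.execute(
    "SELECT text FROM message "
    "WHERE NOT (text IS NOT NULL AND text LIKE '%<ignore>%') ORDER BY id"
).fetchall()

conn.close()
```

Here naive keeps only the 'hello' row, while guarded keeps both the 'hello' row and the NULL row.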

Sorting by many to many relationship

In the simplified version of my problem I have a model Document that has a many-to-many relationship to Tag. I would like a query that, given a list of tags, sorts the Documents by how many of the tags they match, i.e. documents that match more tags are displayed first and documents that match fewer tags are displayed later. I know how to do this with a large plain SQL query, but I'm having difficulty getting it to work with querysets. Could anyone help?
class Document(models.Model):
    title = models.CharField(max_length=20)
    content = models.TextField()

class Tag(models.Model):
    display_name = models.CharField(max_length=10)
    documents = models.ManyToManyField(Document, related_name="tags")
I would like to do something like the following:
documents = Document.objects.all().order_by(count(tags__in = ["java", "python"]))
and get first the documents that match both "java" and "python", then the documents that match only one of them and finally the documents that don't match any.
Thanks in advance for your help.

Have a look at this: How to sort by annotated Count() in a related model in Django
Some docs: https://docs.djangoproject.com/en/1.6/topics/db/aggregation/#order-by
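The ordering the question asks for can be sketched in plain Python to make the goal explicit: count how many of the wanted tags each document carries, then sort by that count in descending order. The sample data and the match_count helper are made up for illustration. In the ORM the same idea is normally expressed by annotating each Document with a count and ordering by the annotation, as the linked answer describes.

```python
# Made-up sample data: document title -> set of tag display names.
documents = {
    "Intro to Java": {"java"},
    "Polyglot notes": {"java", "python", "go"},
    "Cooking basics": {"food"},
    "Python tips": {"python"},
}
wanted = {"java", "python"}


def match_count(title):
    """Number of wanted tags attached to the given document."""
    return len(documents[title] & wanted)


# sorted() is stable, so documents with the same count keep their original order.
ranked = sorted(documents, key=match_count, reverse=True)
```

Documents matching both tags come first, then those matching one, and finally those matching none.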

Django Order_by blank

I have a simple model
title = models.CharField(max_length=250)
url = models.CharField(max_length=250)
title_french = models.CharField(max_length=250)
I want to order by title_french; however, when it sorts A-Z this way, all the blank values come first. For blank values I display the English title instead.
So I get A-Z for the French titles, but at the top there is a batch of unordered English titles.
Any advice?

In your case, I think you should do the sorting in your Python code (as it stands, the sorting is done in the database). As far as I know it is not possible to do what you want in the DB, at least not without writing some SQL by hand.
So the idea would be to do something like this in your view:
your_objects = list(YourObject.objects.filter(....))
your_objects.sort(key=lambda ob: ob.title_french if ob.title_french else ob.title)
As long as you sort small lists, this should not be much of a performance problem.
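Here is a small self-contained sketch of that in-Python sort, with a namedtuple standing in for the model instances. The Page type, the display_title helper and the sample rows are assumptions for illustration only.

```python
from collections import namedtuple

# Stand-in for the model: only the two fields the sort needs.
Page = namedtuple("Page", ["title", "title_french"])

pages = [
    Page("Zebra", ""),
    Page("Apple", "Pomme"),
    Page("Book", ""),
    Page("Cat", "Chat"),
]


def display_title(page):
    """French title when present, otherwise the English one."""
    return page.title_french or page.title


# Sort by the title that is actually displayed, ignoring case.
ordered = sorted(pages, key=lambda p: display_title(p).casefold())
shown = [display_title(p) for p in ordered]
```

Sorting on the displayed title interleaves the English fallbacks with the French titles instead of grouping the blanks at the top.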

Have you tried ordering by multiple fields (doc)?
ordering = ('title_french', 'title')

Specify both columns, title_french and title, in order_by:
queryset.order_by('title_french', 'title')
title_french takes precedence; if two entries have the same title_french, they are sorted by their title.

Here is a way to order blank values last using only the ORM:
from django.db.models import Case, When, Value
...
title_french_blank_last = Case(
    When(title_french="", then=Value(1)),
    default=Value(0)
)
...
queryset.order_by(title_french_blank_last, "title_french", "title")
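The Case/When expression amounts to a composite sort key: a flag that is 1 for blank French titles and 0 otherwise, followed by the two title columns. A plain-Python sketch of the same key, using made-up rows, shows the resulting order; it illustrates the idea and does not replace the ORM query.

```python
# Made-up rows standing in for the queryset's records.
rows = [
    {"title": "Zebra", "title_french": ""},
    {"title": "Apple", "title_french": "Pomme"},
    {"title": "Book", "title_french": ""},
    {"title": "Cat", "title_french": "Chat"},
]


def blank_last_key(row):
    """Mirror of order_by(title_french_blank_last, "title_french", "title")."""
    blank_flag = 1 if row["title_french"] == "" else 0
    return (blank_flag, row["title_french"], row["title"])


ordered_titles = [row["title"] for row in sorted(rows, key=blank_last_key)]
```

Rows with a French title come first, ordered by that title, and the blank rows follow, ordered by their English title.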

Django has options to order with nulls_first and nulls_last; for details see the docs.
In your case it would be something like this (not tested):
from django.db.models.functions import Coalesce
MyModel.objects.order_by(Coalesce('title_french', 'title').asc(nulls_last=True))
You would still need some logic in Python to display the title when the French title is None.
