How do you access/configure summaries/snippets in Django Haystack - django

I'm working on getting django-haystack set up on my site, and am trying to have snippets in my search results roughly like so:
Title of result one about Wikis ...this special thing about wiki values is that...I always use a wiki when I walk...snippet value three talks about wikis too...and here's another snippet value
about wikis.
I know there's a template tag that uses Haystack code to do the the highlighting, but the snippets it generates are pretty limited:
they always start with the query word
there's only one snippet value
they don't support asterisk queries
and other stuff?
Is there a way to use the Solr backend to generate proper snippets as shown above?

Bottom line is that the Solr highlighting can't really be used by Haystack in a flexible way. I spoke to the main developer for Haystack on IRC, and he said basically, if I want to have the kind of highlighting I'm looking for, the only way to get it is to extend the Solr backend that Haystack uses.
I dabbled in that for about half a day, but couldn't get Haystack to recognize my custom back end. Haystack has some magic backend loading code that just wasn't working with me.
Consequently, I've switched over to sunburnt, which provides a lighter-weight and more extensible wrapper around Solr. I'm hoping it will fare better.

from haystack.utils import Highlighter
my_text = 'This is a sample block that would be more meaningful in real life.'
my_query = 'block meaningful'
highlight = Highlighter(my_query)
highlight.highlight(my_text)
http://docs.haystacksearch.org/dev/highlighting.html

Related

Querying Django Postgres for Fuzzy FTS + Typeahead

I'm using PostgreSQL 12.3 with Django 3.0.8. I'm trying to implement a typeahead functionality. I tried using Trigram similarity, but that didn't work too well since, if I were to have a text saying "chair," I would have to type "cha" before I get results.
I'm using corejavascript/twitter's typeahead JS library. Their first example (The Basics) allows you to type in a single letter and get results instantly, some of which are not even the prefix of the word.
How would I do that? You can view my current functionality here: https://github.com/Donate-Anything/Donate-Anything/blob/76348e9362d386d3d6375b9a75d47d5765960992/donate_anything/item/views.py#L21-L30
I thought that I should use to_tsvector with a SearchVectorField, but when I implemented that (with a query like this:
queryset = (
Item.objects.defer("is_appropriate")
.filter(name_search=str(query))
)
then I don't even see the results until I type in the full word. What would my implementation look like then? (Article for implementing the triggers for SearchVectorField)

Django Haystack similarity search

I'm a Django newbie doing a primitive website. I installed haystack and Whoosh as its search engine cause it was the simplest thing to do. It works fine, but there is a problem and I don't know how to Google it. I have some categories on my site and I have indexed their names to search. So, when a user enters "Computing" it finds the computing category and links to it. But there is a problem. If a user enters "Comp" into search field, it doesn't find "Computing" at all. Is this something that can be configured and how?
EDIT:
What else have I tried? Installing haystack 2.0, following this tutorial, installing solr instead of whoosh, trying Ngram fields, rebuilding indexes 10 times, rewriting search_indexes.py. Everything. Doesn't work. If I type in Comp, it doesn't find Computing. Is there anything else I could do? I have noticed that in the tutorial above, everything works like a charm instantly.
When you do the usual:
SearchQuerySet().filter(title='Computing')
in Haystack 1.x, it filters on everything exactly matching 'Computing'.
You can change that behaviour by using Haystack's Field Lookups, for example, using 'contains' will filter on anything containing the given string (Computing, Utingcomp, Comp):
SearchQuerySet().filter(title__contains='Comp')
In Haystack 2.x, the default filter is 'contains', so it should behave as you would expect it to "out-of-the-box"
Check out the documentation on autocomplete. You need to setup your indices to support Ngram's, but this should be exactly what you need.
from haystack.query import SearchQuerySet
SearchQuerySet().autocomplete(content_auto='old')
# Result match things like 'goldfish', 'cuckold' & 'older'.
So, if I'm understanding, what you're looking for is the equivalent of 'LIKE' in SQL.
The problem is search engines that back Haystack aren't like an RDBMS.
The low level implementation of this filter will involve using wildcard characters but most of the Haystack backends don't support a leading wildcard, something required for an icontains/endswith filter. However, since most backends support trailing wildcards, Haystack 2.x includes a startswith filter. The only case this doesn't handle is searching for the end of a word, which doesn't look to be possible.
So, if you have indexed:
"Look at our great discounts in Computer section"
Then the following Haystack query DO match:
SearchQuerySet().filter(title__startswith='comp')
# match!
Notice the difference between Django vs. Haystack startswith filters. Django startswith will match at the beginning of the complete sentence (i.e. a CharField), but the Haystack one will match at the beginning of a token (i.e. each word in a complete sentence).
Hope it helps!

Django admin URL query string "OR". Is it possible?

I love being able to write quick and dirty query strings right into the URL of the Django admin. Like: /admin/myapp/mymodel/?pub_date__year=2011
AND statements are just as easy: /admin/myapp/mymodel/?pub_date__year=2011&author=Jim
I'm wondering if it's possible to issue an 'OR' statement via the URL. Anyone heard of such functionality?
Django < 1.4 doesn't support OR queries. Sometimes it is possible to translate OR queries to __in - queries which are supported (they are equivalent to OR queries but only for single field values).
You can also upgrade to django development version: it has more versatile list_filter implementation (see https://docs.djangoproject.com/en/dev//ref/contrib/admin/#django.contrib.admin.ModelAdmin.list_filter ) which can be used for providing advanced admin filters (including OR-queries).
The & is not a logical AND, even though it seems to be acting that way in your case. I'm pretty certain there is no way to create a logical OR in the GET query string.

django internalization in urls? how to make urls like this: "en/articles" and "pt/artigos"...?

Hy!
I would need to make urls based on language.
Example:
If I had the language english and subcategory named "articles" then the url might be like this:
/en/articles/...
If the language were portuguese and subcategory already translated is "artigos" then the url will be:
/pt/artigos/...
How can I do it?
I must use the urls.py or some monkeypatches?
thanks
This features is already existing in the yet-to-be released version 1.4. You can read about it in the release note.
If your really need this feature, and no existing app feets your needs, you can still try to apply the corresponding patch yourself.
Django LocaleURL is a piece of middleware that does exactly this. The documentation can be found in the source, or online.
Edit: I read over the fact that you want to translate the url itself... I'm not aware of any piece of code that provides this. Perhaps you could extend the LocalURL middleware to take care of the translations for you. Say you have a regex match like (?P<articles>\w+), you could in the middleware determine which view you want to use. Something like the following mapping perhaps?
if article_slug in ['articles', 'artigos', 'article']:
article(request) # Call the article view
I've been using transurlvania with great success, it does exactly what you need and more, however i see that in the next Django release django-i18nurls will be included in django core so perhaps it would be better to learn that

Inexact full-text search in PostgreSQL and Django

I'm new to PostgreSQL, and I'm not sure how to go about doing an inexact full-text search. Not that it matters too much, but I'm using Django. In other words, I'm looking for something like the following:
q = 'hello world'
queryset = Entry.objects.extra(
where=['body_tsv ## plainto_tsquery(%s)'],
params=[q])
for entry in queryset:
print entry.title
where I the list of entries should contain either exactly 'hello world', or something similar. The listings should then be ordered according to how far away their value is from the specified string. For instance, I would like the query to include entries containing "Hello World", "hEllo world", "helloworld", "hell world", etc., with some sort of ranking indicating how far away each item is from the perfect, unchanged query string.
How would you go about doing this?
Your best bet is to use Django raw querysets, I use it with MySQL to perform full text matching. If the data is all in the database and Postgres provides the matching capability then it makes sense to use it. Plus Postgres offers some really useful things in terms of stemming etc with full text queries.
Basically it lets you write the actual query you want yet returns models (as long as you are querying a model table obviously).
The advantage this gives you is that you can test the exact query you will be using first in Postgres, the documentation covers full text queries pretty well.
The main gotcha with raw querysets at the moment is they don't support count. So if you will be returning lots of data and have memory constraints on your application you might need to do something clever.
"Inexact" matching however isn't really part of the full text searching capabilities. Instead you want the postgres fuzzystrmatch contrib module. It's use is described here with indexes.
The best would be to use a search engine for this purpose. Django-haystack supports the integration of three different search engines.
In 2022, Django supports full text search with postgres. Full documentation here: https://docs.djangoproject.com/en/4.0/ref/contrib/postgres/search/