Solr: List results by distance - django

I'd like to pass some parameters to Solr that should affect the weighting of the results (I do not want to filter out results that do not match these criteria).
E.g. I'd like to have a language attribute, and if I pass the user's language to the search engine, I'd like the results matching that language to be listed first. As a newbie to Solr, I'd like to know if and how this is possible!

Yes, that's possible by using boost functions. See this FAQ entry or the description of boost functions for the DisMaxQueryPlugin (note that dismax is not Solr's default query parser; it has to be selected, for example with defType=dismax).
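To make that concrete, here is a minimal sketch of the request parameters such a search could send. It uses the dismax bq (boost query) parameter rather than a boost function, since a plain field match is all that is needed here. The field name language and the boost value are assumptions, not something from the question.

```python
from urllib.parse import urlencode


def language_boost_params(user_query, language, boost=5.0):
    """Build Solr request parameters that rank matching-language docs higher.

    Assumes the schema has a field called `language` (hypothetical name).
    """
    return {
        "defType": "dismax",  # select the DisMax query parser explicitly
        "q": user_query,      # the user's search terms
        # Boost query: raises the score of matching docs, never filters any out
        "bq": f"language:{language}^{boost}",
    }


params = language_boost_params("django tutorial", "en")
query_string = urlencode(params)  # ready to append to the /select URL
```

Documents in other languages still come back; they just score lower than the ones matching the boost query.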

Related

Django-dsl-drf Exclude phrase query

I am working on integrating Elastic Search in my existing Django REST application. I am using the django-dsl-drf module provided in the link below:
https://django-elasticsearch-dsl-drf.readthedocs.io/
Their documentation provides an 'exclude' query param, but the query only works when we provide the full field value.
search-url?exclude=<field-value>
For example, if I have the value 'Stackoverflow' in the field 'name', I have to provide the query param
?name__exclude=Stackoverflow to exclude records having 'Stackoverflow' as name from the result. I would like to implement a search such that when I provide 'over', these records are excluded as well, similar to ?name__exclude=over
I checked the documentation above, but I couldn't find this. Is there any workaround so that I can exclude records whose fields contain a term, case-insensitively, instead of providing the full field value?
Thanks a lot.
Using the contains functional filter, you can target documents that have their name field value containing the characters over anywhere in their terms:
?name__contains=over
However, as far as I know, there is no way to negate that filter in django-dsl-drf. You can create an issue requesting that feature, though, because odds are high that you're not the only one who needs it, since it's a pretty common way of searching.
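If you are able to drop down to the Elasticsearch query DSL yourself, the negated form can be written as a bool query with a must_not wildcard clause. This is only a sketch of the request body, not something django-dsl-drf provides; it assumes Elasticsearch 7.10 or later for the case_insensitive option, and the helper name is made up for illustration.

```python
def exclude_containing(field, fragment):
    """Return a query body that excludes docs whose `field` contains `fragment`.

    Hypothetical helper: wildcard characters inside `fragment` are not escaped.
    """
    return {
        "query": {
            "bool": {
                # must_not removes every document matched by the clause below
                "must_not": [
                    {
                        "wildcard": {
                            field: {
                                "value": f"*{fragment}*",
                                # Assumes Elasticsearch 7.10+ supports this flag
                                "case_insensitive": True,
                            }
                        }
                    }
                ]
            }
        }
    }


body = exclude_containing("name", "over")
```

Leading wildcards can be slow on large indices, so this is worth measuring before relying on it.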

haystack query for solr to find items where attribute is unset or specified value

I'm trying to query solr through haystack for all objects that either do not have an attribute (it's Null) or have a specified value for that attribute.
I can query solr directly with the snippet (brand:foo OR (*:* -brand:*)) and get what I want. But I can't find a way to formulate this or anything logically the same through haystack without really ugly hacks.
I did find this ugly hack:
SearchQuerySet().filter(brand=Raw('%s OR (*:* -brand:*)' % Clean('foo')))
But it chains really poorly with that OR in there without any parentheses around it.
Ideally, a solution using a pure filter would be best, but failing that, a way to add a chainable filter using the raw solr query language would do.
I'm using django-haystack 2.4.0
It's not a perfect match, but narrow helps enough to let me do what I want:
SearchQuerySet().narrow('(brand:%s OR (*:* -brand:*))' % Clean('foo'))
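Building that clause by hand gets error-prone once values contain quotes, so a small helper can do the string work. This is a sketch in plain Python; the function name is made up, and it only escapes backslashes and double quotes inside a quoted phrase.

```python
def value_or_missing(field, value):
    """Build a raw Solr clause: `field` equals `value`, or `field` is unset."""
    # Quote the value as a phrase; escape embedded backslashes and quotes
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    # Outer parentheses keep the OR from leaking into neighbouring clauses
    return f'({field}:"{escaped}" OR (*:* -{field}:*))'


clause = value_or_missing("brand", "foo")
```

The result can then be passed as SearchQuerySet().narrow(clause), and because the whole clause is parenthesized it should chain with other filters.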

index analyzer vs query analyzer in haystack - elasticsearch?

Elasticsearch itself seems to support an index analyzer and a query analyzer,
but haystack's elasticsearch backend doesn't seem to differentiate between them.
Am I correct?
A related question:
Elasticsearch's DEFAULT_SETTING seems to have 'settings.analysis.analyzer' and 'index.analysis.analyzer'. (E.g. http://www.wellfireinteractive.com/blog/custom-haystack-elasticsearch-backend/ has 'index'.) What's the difference between them?
With haystack, you want to set the mappings yourself.
I wrote about haystack as well earlier here: Django Haystack Distinct Value for Field
In the settings, you can define analyzers on a per-field basis: a default analyzer (which is what haystack defaults to, and which gets applied at both search and index time), an index-time analyzer, and a search-time analyzer.
It's usually good practice to define both a search-time analyzer and an index-time analyzer, even if they are exactly the same.
With snowball text analysis, you might want to apply the same analyzer at both search and index time, but for something like an autocomplete feature you might not want that (which is what haystack does). There, you want the index analyzer to store (edge) n-grams, and usually you want to apply a stricter search-time analysis, like keyword.
You almost never want to let haystack define the mapping.
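As a rough illustration of the autocomplete case, the settings and mapping could look like the sketch below. It is written in the syntax of recent Elasticsearch versions (older releases name some of these options differently), and the analyzer, filter and field names are all made up for the example.

```python
# Hypothetical index definition: edge n-grams at index time only
INDEX_SETTINGS = {
    "settings": {
        "analysis": {
            "filter": {
                "autocomplete_filter": {
                    "type": "edge_ngram",
                    "min_gram": 2,
                    "max_gram": 15,
                },
            },
            "analyzer": {
                # Index time: store every prefix of each token
                "autocomplete_index": {
                    "type": "custom",
                    "tokenizer": "standard",
                    "filter": ["lowercase", "autocomplete_filter"],
                },
                # Search time: stricter, no n-grams applied to the user's input
                "autocomplete_search": {
                    "type": "custom",
                    "tokenizer": "standard",
                    "filter": ["lowercase"],
                },
            },
        }
    },
    "mappings": {
        "properties": {
            "name_auto": {
                "type": "text",
                "analyzer": "autocomplete_index",
                "search_analyzer": "autocomplete_search",
            }
        }
    },
}
```

With this split, typing "sta" matches the stored prefix of "stackoverflow" without the query itself being broken into n-grams.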
As for the second part, see here: http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/indices-create-index.html
Midway down it says:
"Note you do not have to explicitly specify index section inside settings section."
I just tried this myself as well, because I had never tested it.

Using django-haystack, how do I perform a search with only partial terms?

I've got a Haystack/xapian search index for django.contrib.auth.models.User. The template is simply
{{object.get_full_name}}
as I intend for a user to type in a name and be able to search for it.
My issue is this: if I search for, say, Sri (my full first name), I get a result for the user object pertaining to my name. However, if I search for Sri Ragh (that is, my full first name and part of my last name), I get no results.
How can I set Haystack up so that I can get the appropriate results for partial queries?
(I essentially want it to search *Sri Ragh*, but I don't know if wildcards would actually do the trick, or how to implement them).
This is my search query:
results = SearchQuerySet().filter(content='Sri Ragh')
I used to have a similar problem. As a workaround, or maybe a fix, you can change the query lookup:
results = SearchQuerySet().filter(content__startswith='Sri Ragh')
The issue is that django-haystack doesn't implement every search engine's query syntax. Of course, you can fall back to a raw query:
results = SearchQuerySet().raw_search('READ THE SEARCH ENGINE QUERY SYNTAX FOR GET WILDCARD LOOKUPS')
As Django-haystack says, this is not portable.
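Another option that stays within haystack's lookups is to treat only the last word as a prefix and require the earlier words as full terms. The helper below is a sketch in plain Python (the name is made up); it just produces the keyword arguments, which you would then apply one by one with sqs = sqs.filter(**kwargs), assuming the default AND behaviour when chaining filters.

```python
def partial_name_filters(text):
    """Split a query so that only the last word is matched as a prefix."""
    words = text.split()
    if not words:
        return []
    # Every word before the last must match as a complete term
    filters = [{"content": word} for word in words[:-1]]
    # The word still being typed is matched with a startswith lookup
    filters.append({"content__startswith": words[-1]})
    return filters


filters = partial_name_filters("Sri Ragh")
```

Whether this gives the ranking you want depends on the backend, so it is worth trying against your xapian index.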
You can use icontains or startswith.
Be careful with icontains: if the query is, for example, 'r', it will return every 'Model' entity that has an 'r' in its content.
Model.objects.filter(content__icontains=query)
Model.objects.filter(content__startswith=query)
Look at the documentation

Solr Query Syntax

I just got started looking at using Solr as my search web service. I don't know whether Solr supports these query types:
Startswith
Exact Match
Contain
Doesn't Contain
In the range
Could anyone guide me how to implement those features in Solr?
Cheers,
Samnang
Solr is capable of all those things, but to adequately explain how to do each of them, an answer would have to become a mini-manual for Solr.
I'd suggest you read the actual manual and tutorials linked from the Solr homepage.
In short though:
Startswith can be implemented using Lucene wildcards.
Exact matches will only be found if a field is not tokenized, i.e. the entire field is viewed as a single token.
Contain is the default search format, i.e. a search for "John" will find any document whose search field contains the value "John". Prefixing with - negates it (e.g. "-John" will only find documents that do not contain John).
Ranges (be they date or integer) are possible and quite powerful. For example, date:[* TO NOW] would find any document whose date is not in the future.
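To tie the list together, here is roughly what each query string looks like, written as small Python helpers. The function and field names are only for illustration, and no escaping of special characters is attempted.

```python
def starts_with(field, prefix):
    """Prefix match via a trailing Lucene wildcard."""
    return f"{field}:{prefix}*"


def exact_phrase(field, value):
    """Phrase match; a whole-field match only if the field is untokenized."""
    return f'{field}:"{value}"'


def contains_term(field, term):
    """Default term search on a tokenized field."""
    return f"{field}:{term}"


def does_not_contain(field, term):
    # *:* selects every document, then the negated clause removes matches
    return f"*:* -{field}:{term}"


def in_range(field, low, high):
    """Inclusive range; pass "*" for an open end."""
    return f"{field}:[{low} TO {high}]"


examples = [
    starts_with("name", "Joh"),
    exact_phrase("name", "John Smith"),
    contains_term("name", "John"),
    does_not_contain("name", "John"),
    in_range("date", "*", "NOW"),
]
```

Each string can be sent as the q parameter of a Solr request.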