Elasticsearch template not reading params - templates

I've been following the example at this site Parameterizing Queries in Solr and Elasticsearch in the ES section. Note, it is an older ES version that the author is working with, but I don't think that should affect this situation. I am using version 1.6 of ES.
I have a template {{ES_HOME}}/config/scripts/test.mustache that looks like the below snippet. Note the {{q}} parameter in "query".
{
"query": {
"multi_match": {
"query": "{{q}}",
"analyzer": "keyword",
"fields": [
"description^10",
"name^50",
]
}
},
"aggregations": {
"doctype" : {
"terms" : {
"field" : "doctype.untouched"
}
}
}
}
I POST to http://localhost:9200/forward/_search/template with the following message body
{
"template": {
"file": "test",
"params": {
"q": "a"
}
}
}
It runs the template, but gets 0 hits, and returns the following:
{
"took": 3,
"timed_out": false,
"_shards": {
"total": 5,
"successful": 5,
"failed": 0
},
"hits": {
"total": 0,
"max_score": null,
"hits": []
},
"aggregations": {
"doctype": {
"doc_count_error_upper_bound": 0,
"sum_other_doc_count": 0,
"buckets": []
},
}
}
Alternatively, if I hard code "a" into where {{q}} is, and post to the template, it correctly returns results for the query "a". Am I doing something inherently wrong in my design?

According to the docs, the params object should be outside of the template object
{
"template": {
"file": "test"
},
"params": {
"q": "a"
}
}

Related

How to apply custom score to a search filed in Elastic Search

I am making a search query in Elastic Search and I want to treat the fields the same when they match. For example if I search for field field1 and it matches, then the _score is increase by 10(for example), same for the field2.
I was tried function_score but it's not working. It throws an error.
"caused_by": {
"type": "class_cast_exception",
"reason": "class
org.elasticsearch.index.fielddata.plain.SortedSetDVOrdinalsIndexFieldData
cannot be cast to class
org.elasticsearch.index.fielddata.IndexNumericFieldData
(org.elasticsearch.index.fielddata.plain.SortedSetDVOrdinalsIndexFieldData
and org.elasticsearch.index.fielddata.IndexNumericFieldData are in unnamed
module of loader 'app')"
}
The query:
{
"track_total_hits": true,
"size": 50,
"query": {
"function_score": {
"query": {
"bool": {
"must": [
{
"term": {
"field1": {
"value": "Value 1"
}
}
},
{
"term": {
"field2": {
"value": "value 2"
}
}
}
]
}
},
"functions": [
{
"field_value_factor": {
"field": "field1",
"factor": 10,
"missing": 0
}
},
{
"field_value_factor": {
"field": "field2",
"factor": 10,
"missing": 0
}
}
],
"boost_mode": "multiply"
}
}
}
You can use function score with filter function to boost.
assuming that your mapping looks like the one below
{
"mappings": {
"properties": {
"field_1": {
"type": "keyword"
},
"field_2": {
"type": "keyword"
}
}
}
}
with documents
{"index":{}}
{"field_1": "foo", "field_2": "bar"}
{"index":{}}
{"field_1": "foo", "field_2": "foo"}
{"index":{}}
{"field_1": "bar", "field_2": "bar"}
you can use weight parameter to boost the documents matched for each query.
{
"query": {
"function_score": {
"query": {
"match_all": {}
},
"functions": [
{
"filter": {
"term": {
"field_1": "foo"
}
},
"weight": 10
},
{
"filter": {
"term": {
"field_2": "foo"
}
},
"weight": 20
}
],
"score_mode": "multiply"
}
}
}
You can refer below solution if you want to provide manual weight for different field in query. This will always replace highest weight field on top of your query response -
Elasticsearch query different fields with different weight

How do I get only the element values that match in the list in the Elastic Search?

[Hi, there]
I want to create an ES query that only retrieves certain elements that match in the list.
Here is my ES index schema.
"test-es-2018":{
"aliases": {},
"mappings": {
"test-1": {
"properties": {
"categoryName": {
"type": "keyword",
"index": false
},
"genDate": {
"type": "date"
},
"docList": {
"properties": {
"rank": {
"type": "integer",
"index": false
},
"doc-info": {
"properties": {
"docId": {
"type": "keyword"
},
"docName": {
"type": "keyword",
"index": false
},
}
}
}
},
"categoryId": {
"type": "keyword"
},
}
}
}
}
There are documents listed in the category. Documents in the list have their own information.
*search query in Kibana.
source": {
"categoryName" : "food" ,
"genDate" : 1577981646638,
"docList" [
{
"rank": 2,
"doc-info": {...}
},
{
"rank": 1,
"doc-info": {...}
},
{
"rank": 5,
"doc-info": {...}
},
],
"categoryId": "201"
}
First, I want to get only the element value that match in the list.
I would like to see only documents with rank 1 in the list. However, if I query using match as below, the result is the same as *search query in kibana.
*match query in Kibana.
GET test-es-2018/_search
{
"query": {
"bool": {
"must": [
{ "match": { "docList.rank": 1 } },
]
}
}
}
In my opinion, it seems to print the entire list because it contains a document with rank one.
What I want is:
source": {
"categoryName" : "food" ,
"genDate" : 1577981646638,
"docList" [
{
"rank": 1,
"doc-info": {...}
},
],
"categoryId": "201"
}
Is this possible?
Second, I want to sort the docList by rank. I tried sorting by creating a query like the following, but it was not sorted.
*sort query in Kibana.
GET test-es-2018/_search?
{
"query" : {
"bool" : {...}
},
"sort" : [
{
"docList.rank" : {
"order" : "asc"
}
}
]
}
What I want is:
source": {
"categoryName" : "food" ,
"genDate" : 1577981646638,
"docList" [
{
"rank": 1,
"doc-info": {...}
},
{
"rank": 2,
"doc-info": {...}
},
{
"rank": 5,
"doc-info": {...}
},
],
"categoryId": "201"
}
I do not know how to access the list. Is there a good idea for both of these issues?
In general you could use source filter to retrieve only part of the document but this way it's not possible to exclude some fields based on their values.
As far as I know Elasticsearch doesn't support changing order of field values in the _source. Partly the desired result can be achieved by using nested fields along with inner_hits -> sort query expression. This way sorted subhits will be returned in the inner_hits section of the response.
P.S. Typically working with Elasticsearch you should consider indexed document as the smallest indivisible search unit.

Regexp vs Include performance comparison in Elasticsearch

I work on a project and I need to aggregate the results based on "created" and "labels" field.
I created following queries that both give the result as I expected. But I want to learn that which query runs more fast?
My first query:
{
"size": 0,
"aggs": {
"HEATMAP": {
"date_histogram": {
"field": "created",
"interval": "day"
},
"aggs": {
"BEHAVIOUR_CHANGE": {
"terms": {
"field": "labels",
"include": "behavior-change"
}
},
"FIRST_OCCURRENCE": {
"terms": {
"field": "labels",
"include": "first-occurrence"
}
}
}
}
}
}
My second query:
{
"size": 0,
"aggs": {
"HEATMAP": {
"date_histogram": {
"field": "created",
"interval": "day"
},
"aggs": {
"BEHAVIOUR_CHANGE": {
"filter": {
"regexp": {
"labels": "behavior-change"
}
}
},
"FIRST_OCCURRENCE": {
"filter": {
"regexp": {
"labels": "first-occurrence"
}
}
}
}
}
}
}
Since that field is a keyword and you don't need anything special when it comes to a regular expression (only a perfect match), I would do it like the following. You'd note also that I added a terms filter to the query part to try and narrow down the results before being put through the aggregations (theoretically, for the aggregations to have less work to do). Also, I don't see a reason to use regexp here, thus I used the terms aggregations. If you are really interested in the performance comparison, I'd suggest setting up a load test with many more documents and terms in that field and perform some tests. Elastic has its own benchmarking tool that you could use for this: Rally.
{
"size": 0,
"query": {
"terms": {
"labels": [
"behavior-change",
"first-occurrence"
]
}
},
"aggs": {
"HEATMAP": {
"date_histogram": {
"field": "created",
"interval": "day"
},
"aggs": {
"BEHAVIOUR_CHANGE": {
"terms": {
"field": "labels",
"include": "behavior-change"
}
},
"FIRST_OCCURRENCE": {
"terms": {
"field": "labels",
"include": "first-occurrence"
}
}
}
}
}
}

Elasticsearch - Tokenizer configuration

Someone have any idea of what tokenizer to use and how to enable rule for the below,
Input : ["test1-data.example.com", "test2-new.example.com", "new1-test.example.com"]
Output (expected ) :
test1-data.example.com test2-new.example.com new1-test.exampl.com
It's not obvious whether it solves your problem or not, but here's one way you can do what it sounds like you're asking:
DELETE /test_index
PUT /test_index
{
"settings": {
"number_of_shards": 1
},
"mappings": {
"doc": {
"_all": {
"enabled": true,
"store": true,
"index": "not_analyzed"
},
"properties": {
"text_field": {
"type": "string",
"include_in_all": true
}
}
}
}
}
PUT /test_index/doc/1
{
"text_field": ["test1-data.example.com", "test2-new.example.com", "new1-test.example.com"]
}
POST /test_index/_search
{
"fields": [
"_all"
]
}
...
{
"took": 3,
"timed_out": false,
"_shards": {
"total": 5,
"successful": 5,
"failed": 0
},
"hits": {
"total": 1,
"max_score": 1,
"hits": [
{
"_index": "test_index",
"_type": "doc",
"_id": "1",
"_score": 1,
"fields": {
"_all": "test1-data.example.com test2-new.example.com new1-test.example.com "
}
}
]
}
}
Here's the code in Sense:
http://sense.qbox.io/gist/45200711a41268634439b669e18541e68042ac8a

MongoDB Aggregate Regex Match or Full Text Search returns whole Document

Ex. Record
[
{
"_id": "5528cfd2e71144e020cb6494",
"__v": 11,
"Product": [
{
"_id": "5528cfd2e71144e020cb6495",
"isFav": true,
"quantity": 27,
"price": 148,
"description": "100g",
"brand": "JaldiLa",
"name": "Grapes",
"sku": "GRP"
},
{
"_id": "552963ed63d867b81e18d357",
"isFav": false,
"quantity": 13,
"price": 290,
"description": "100g",
"brand": "JaldiLa",
"name": "Apple",
"sku": "APL"
}
],
"brands": [
"Whole Foods",
"Costco",
"Bee's",
"Masons"
],
"sku": "FRT",
"name": "Fruits"
}
]
My Mongoose function to return query from AngularJS(http://localhost:8080/api/search?s=)
router.route('/search')
.get(function(req, res) {
Dept.aggregate(
{ $match: { $text: { $search: req.query.s } } },
{ $project : { name : 1, _id : 1, 'Product.name' : 1, 'Product._id' : 1} },
{ $unwind : "$Product" },
{ $group : {
_id : "$_id",
Category : { $addToSet : "$name"},
Product : { $push : "$Product"}
}}
)
});
RESULT: e.g. http://localhost:8080/api/search?s=Apple / Grape / Carrot, result is same for all.
[
{
"_id": "5528cfd2e71144e020cb6494",
"Category": ["Fruits"],
"Product": [
{
"_id": "5528cfd2e71144e020cb6495",
"name": "Grapes"
},
{
"_id": "552963ed63d867b81e18d357",
"name": "Apple"
},
{
"_id": "552e61920c530fb848c61510",
"name": "Carrots"
}
]
}
]
PROBLEM: On a query of "apple", it returns all objects within Product instead of just "grapes", i think maybe putting match after unwind would do the trick or $regex case
WHAT I WANT: e.g. for a searchString of "grape"
Also I want it to start sending results as soon as I send in the first two letters of my query.
[{
"_id": ["5528cfd2e71144e020cb6494"], //I want this in array as it messes my loop up
"Category": "Fruits", //Yes I do not want this in array like I'm getting in my resutls
"Product": [{
"_id": "5528cfd2e71144e020cb6495",
"name": "Grapes"
}]
}]
Thanks for being patient.
Use the following aggregation pipeline:
var search = "apple",
pipeline = [
{
"$match": {
"Product.name": { "$regex": search, "$options": "i" }
}
},
{
"$unwind": "$Product"
},
{
"$match": {
"Product.name": { "$regex": search, "$options": "i" }
}
},
{
"$project": {
"Category": "$name",
"Product._id": 1,
"Product.name": 1
}
}
];
db.collection.aggregate(pipeline);
With the above sample document and a regex (case-insensitive) search for "apple" on the name field of the Product array, the above aggregation pipeline produces the result:
Output:
/* 1 */
{
"result" : [
{
"_id" : "5528cfd2e71144e020cb6494",
"Product" : {
"_id" : "552963ed63d867b81e18d357",
"name" : "Apple"
},
"Category" : "Fruits"
}
],
"ok" : 1
}