MongoDB Aggregate Regex Match or Full Text Search returns whole Document

Ex. Record
[
{
"_id": "5528cfd2e71144e020cb6494",
"__v": 11,
"Product": [
{
"_id": "5528cfd2e71144e020cb6495",
"isFav": true,
"quantity": 27,
"price": 148,
"description": "100g",
"brand": "JaldiLa",
"name": "Grapes",
"sku": "GRP"
},
{
"_id": "552963ed63d867b81e18d357",
"isFav": false,
"quantity": 13,
"price": 290,
"description": "100g",
"brand": "JaldiLa",
"name": "Apple",
"sku": "APL"
}
],
"brands": [
"Whole Foods",
"Costco",
"Bee's",
"Masons"
],
"sku": "FRT",
"name": "Fruits"
}
]
My Mongoose function that handles the query sent from AngularJS (http://localhost:8080/api/search?s=):
router.route('/search')
.get(function(req, res) {
Dept.aggregate(
{ $match: { $text: { $search: req.query.s } } },
{ $project : { name : 1, _id : 1, 'Product.name' : 1, 'Product._id' : 1} },
{ $unwind : "$Product" },
{ $group : {
_id : "$_id",
Category : { $addToSet : "$name"},
Product : { $push : "$Product"}
}}
)
});
RESULT: e.g. for http://localhost:8080/api/search?s=Apple / Grape / Carrot, the result is the same for all:
[
{
"_id": "5528cfd2e71144e020cb6494",
"Category": ["Fruits"],
"Product": [
{
"_id": "5528cfd2e71144e020cb6495",
"name": "Grapes"
},
{
"_id": "552963ed63d867b81e18d357",
"name": "Apple"
},
{
"_id": "552e61920c530fb848c61510",
"name": "Carrots"
}
]
}
]
PROBLEM: On a query of "apple", it returns all objects within Product instead of just "Apple". I think putting $match after $unwind, or using $regex, might do the trick.
WHAT I WANT: e.g. for a searchString of "grape", the result shown below.
I also want it to start sending results as soon as I have typed the first two letters of my query.
[{
"_id": ["5528cfd2e71144e020cb6494"], //I want this in array as it messes my loop up
"Category": "Fruits", //Yes I do not want this in array like I'm getting in my results
"Product": [{
"_id": "5528cfd2e71144e020cb6495",
"name": "Grapes"
}]
}]
Thanks for being patient.

Use the following aggregation pipeline:
var search = "apple",
pipeline = [
{
"$match": {
"Product.name": { "$regex": search, "$options": "i" }
}
},
{
"$unwind": "$Product"
},
{
"$match": {
"Product.name": { "$regex": search, "$options": "i" }
}
},
{
"$project": {
"Category": "$name",
"Product._id": 1,
"Product.name": 1
}
}
];
db.collection.aggregate(pipeline);
With the above sample document and a regex (case-insensitive) search for "apple" on the name field of the Product array, the above aggregation pipeline produces the result:
Output:
/* 1 */
{
"result" : [
{
"_id" : "5528cfd2e71144e020cb6494",
"Product" : {
"_id" : "552963ed63d867b81e18d357",
"name" : "Apple"
},
"Category" : "Fruits"
}
],
"ok" : 1
}
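The pipeline above returns one result per matching product, with Product as a single object. A minimal sketch of one way to get closer to the shape asked for in the question (Product as an array, Category as a plain string, results from the first two typed letters onward) is to add a $group stage and anchor the regex as a prefix. It is written in Python only to build the pipeline as plain data. The helper name build_search_pipeline and the min_chars parameter are illustrative assumptions, not part of the original answer, and the sketch leaves _id as a scalar.

```python
import re


def build_search_pipeline(search, min_chars=2):
    """Build the aggregation pipeline as plain dicts (hypothetical helper)."""
    term = search.strip()
    if len(term) < min_chars:
        # Assumption: do not query until at least two characters are typed.
        return None

    # Escape the user input so it is matched literally, and anchor it as a prefix.
    name_filter = {
        "Product.name": {"$regex": "^" + re.escape(term), "$options": "i"}
    }

    return [
        {"$match": name_filter},    # keep only documents with a matching product
        {"$unwind": "$Product"},    # one document per product
        {"$match": name_filter},    # keep only the matching products
        {"$group": {                # rebuild one document per department
            "_id": "$_id",
            "Category": {"$first": "$name"},
            "Product": {
                "$push": {"_id": "$Product._id", "name": "$Product.name"}
            },
        }},
    ]


pipeline = build_search_pipeline("gr")
```

The resulting list can be passed to an aggregate call in the same way as the pipeline variable in the answer.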

Related

How to apply custom score to a search field in Elastic Search

I am making a search query in Elasticsearch and I want to treat the fields the same when they match. For example, if I search on field1 and it matches, then the _score is increased by 10 (for example), and the same for field2.
I tried function_score, but it's not working. It throws an error:
"caused_by": {
"type": "class_cast_exception",
"reason": "class org.elasticsearch.index.fielddata.plain.SortedSetDVOrdinalsIndexFieldData cannot be cast to class org.elasticsearch.index.fielddata.IndexNumericFieldData (org.elasticsearch.index.fielddata.plain.SortedSetDVOrdinalsIndexFieldData and org.elasticsearch.index.fielddata.IndexNumericFieldData are in unnamed module of loader 'app')"
}
The query:
{
"track_total_hits": true,
"size": 50,
"query": {
"function_score": {
"query": {
"bool": {
"must": [
{
"term": {
"field1": {
"value": "Value 1"
}
}
},
{
"term": {
"field2": {
"value": "value 2"
}
}
}
]
}
},
"functions": [
{
"field_value_factor": {
"field": "field1",
"factor": 10,
"missing": 0
}
},
{
"field_value_factor": {
"field": "field2",
"factor": 10,
"missing": 0
}
}
],
"boost_mode": "multiply"
}
}
}
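field_value_factor reads a numeric value from the field, which is why it fails with a class_cast_exception on a non-numeric field. You can instead use function_score with filter functions to boost.
Assuming that your mapping looks like the one below: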
{
"mappings": {
"properties": {
"field_1": {
"type": "keyword"
},
"field_2": {
"type": "keyword"
}
}
}
}
and these documents:
{"index":{}}
{"field_1": "foo", "field_2": "bar"}
{"index":{}}
{"field_1": "foo", "field_2": "foo"}
{"index":{}}
{"field_1": "bar", "field_2": "bar"}
you can use the weight parameter to boost the documents matched by each filter:
{
"query": {
"function_score": {
"query": {
"match_all": {}
},
"functions": [
{
"filter": {
"term": {
"field_1": "foo"
}
},
"weight": 10
},
{
"filter": {
"term": {
"field_2": "foo"
}
},
"weight": 20
}
],
"score_mode": "multiply"
}
}
}
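The score_mode setting controls how the weights of the matching functions are combined. The query above multiplies them, so a document matching both filters gets a combined weight of 200. As a rough sketch, assuming the goal is the additive boost described in the question (10 per matching field), the same query can be generated with score_mode set to sum. The helper name build_weighted_query and its parameters are hypothetical; the field names follow the mapping above.

```python
import json


def build_weighted_query(boosts, score_mode="sum"):
    """boosts: list of (field, value, weight) tuples (hypothetical helper)."""
    functions = [
        {"filter": {"term": {field: value}}, "weight": weight}
        for field, value, weight in boosts
    ]
    return {
        "query": {
            "function_score": {
                "query": {"match_all": {}},
                "functions": functions,
                # Assumption: add the weights instead of multiplying them.
                "score_mode": score_mode,
            }
        }
    }


query = build_weighted_query([("field_1", "foo", 10), ("field_2", "foo", 10)])
print(json.dumps(query, indent=2))
```

The printed JSON can be sent as the body of a _search request.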
You can refer to the solution below if you want to assign manual weights to different fields in a query. It will always place the highest-weighted field at the top of your query response:
Elasticsearch query different fields with different weight

Query with id, nested array and range in Elastic Search (Open Search AWS)

I have ES documents like the ones below:
{
"_id" : "test#domain.com",
"age" : 12,
"hobbiles" : ["Singing", "Dancing"]
},
{
"_id" : "test1#domain.com",
"age" : 7,
"hobbiles" : ["Coding", "Chess"]
}
I am storing the email as the id, along with age and hobbiles. hobbiles is a nested type and age is a long. I want to query by id, age and hobbiles, something like below:
Select * FROM tbl where _id IN ('val1', 'val2') AND age > 5 AND hobbiles should match with Chess or Dancing
How can I do this in Elasticsearch? I am using OpenSearch 1.3 (latest) on AWS.
I will assume that the field hobbiles is a keyword; in that case, I suggest the following query:
PUT test
{
"mappings": {
"properties": {
"age": {
"type": "long"
},
"hobbiles": {
"type": "keyword"
}
}
}
}
POST test/_doc/test#domain.com
{
"age": 12,
"hobbiles": [
"Singing",
"Dancing"
]
}
POST test/_doc/test1#domain.com
{
"age": 7,
"hobbiles": [
"Coding",
"Chess"
]
}
GET test/_search
{
"query": {
"bool": {
"filter": [
{
"terms": {
"_id": [
"test1#domain.com",
"test#domain.com"
]
}
}
],
"must": [
{
"range": {
"age": {
"gt": 5
}
}
},
{
"terms": {
"hobbiles": [
"Chess",
"Dancing"
]
}
}
]
}
}
}
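Each clause of the SQL-style statement in the question maps to one clause of the bool query: the IN list becomes a terms query on _id, age > 5 becomes a range query, and "Chess or Dancing" becomes a terms query on hobbiles. A small sketch that builds the same request body from parameters, assuming the keyword mapping above; the helper name build_profile_query is hypothetical.

```python
def build_profile_query(ids, min_age, hobbies):
    """Build the search body as a plain dict (hypothetical helper)."""
    return {
        "query": {
            "bool": {
                # _id IN (...)
                "filter": [{"terms": {"_id": list(ids)}}],
                "must": [
                    # age > min_age
                    {"range": {"age": {"gt": min_age}}},
                    # hobbiles contains at least one of the given values
                    {"terms": {"hobbiles": list(hobbies)}},
                ],
            }
        }
    }


query = build_profile_query(
    ["test#domain.com", "test1#domain.com"], 5, ["Chess", "Dancing"]
)
```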

Elastic Search Sort

I have a table for some activities like
[
{
"id": 123,
"name": "Ram",
"status": 1,
"activity": "Poster Design"
},
{
"id": 123,
"name": "Ram",
"status": 1,
"activity": "Poster Design"
},
{
"id": 124,
"name": "Leo",
"categories": [
"A",
"B",
"C"
],
"status": 1,
"activity": "Brochure"
},
{
"id": 134,
"name": "Levin",
"categories": [
"A",
"B",
"C"
],
"status": 1,
"activity": "3D Printing"
}
]
I want to get this data from Elasticsearch 5.5 sorted on the field activity, but I need all the data corresponding to name = "Ram" first and then the remaining data, in a single query.
You can use a function score query to boost results that match a filter (in this case, "ram" in name).
The following query should work for you:
POST sort_index/_search
{
"query": {
"function_score": {
"query": {
"match_all": {}
},
"boost": "5",
"functions": [{
"filter": {
"match": {
"name": "ram"
}
},
"weight": 1000
}],
"score_mode": "max"
}
},
"sort": [{
"_score": {
"order": "desc"
}
}, {
"activity.keyword": {
"order": "desc"
}
}]
}
I would suggest using a bool query combined with the should clause.
You will also need to use the sort clause on your field.
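A minimal sketch of what the bool/should suggestion could look like, built as a plain dict. It assumes that every document should still be returned (hence the match_all in must), that matching documents should all receive the same extra score (hence constant_score), and that the activity.keyword sub-field from the first answer exists. The helper name build_pinned_sort_query and the boost value are illustrative, not taken from the answer.

```python
def build_pinned_sort_query(pin_field, pin_value, sort_field, order="asc"):
    """Rank documents matching pin_field first, then sort by sort_field."""
    return {
        "query": {
            "bool": {
                # Every document is returned.
                "must": [{"match_all": {}}],
                # Matching documents get an identical extra score.
                "should": [
                    {
                        "constant_score": {
                            "filter": {"match": {pin_field: pin_value}},
                            "boost": 10,
                        }
                    }
                ],
            }
        },
        "sort": [
            {"_score": {"order": "desc"}},   # pinned documents first
            {sort_field: {"order": order}},  # then the requested field
        ],
    }


query = build_pinned_sort_query("name", "ram", "activity.keyword")
```

Sorting on _score first is what puts the pinned documents on top; the second sort key then orders documents within each group.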

ElasticSearch - copy_to with dynamic template

Following up my previous question: ElasticSearch overriding mapping from text to object
I have an index template:
{
"template" : "project.*",
"order" : 100,
"dynamic_templates": [
{
"message_field": {
"mapping": {
"type": "object"
},
"match": "message"
},
"message_properties": {
"path_match": "message.*",
"mapping": {
"type": "string",
"index": "not_analyzed"
}
}
}
]
}
which basically creates new fields for everything under "message" field. I am doing this because "message" field is mapped as a string in another index template and I am overriding it.
Sample document:
{
"level": "30",
...
"kubernetes": {
"container_name": "data-sync-server",
"namespace_name": "alitest03",
...
},
"message": {
"tag": "AUDIT",
"requestId": 1234,
...
},
}
...
}
This works fine, but it ends up creating top level fields like "tag" and "requestId".
I don't want to pollute the top level and would like to have fields like "audit.tag", "audit.requestId".
I tried using copy_to like this, but I don't see any "audit.*" fields:
{
"template" : "project.*",
"order" : 100,
"dynamic_templates": [
{
"message_field": {
"mapping": {
"type": "object"
},
"match": "message"
},
"message_properties": {
"path_match": "message.*",
"mapping": {
"type": "string",
"index": "not_analyzed",
"copy_to" : "audit.{name}"
}
}
}
]
}
A sample search result when using the template above with copy_to is below. I don't see any "audit.*" fields.
{
"timestamp": "October 15th 2018, 15:46:15.994",
"_id": "YmI1NDRjMTgtZTY3Ni00ZGUxLTk2NDMtOTJhZjk3ZWU1YTJj",
"_index": "project.alitestproj02.aa564e69-c643-11e8-af2a-fa163e4c9c9e.2018.10.15",
"_score": "",
"_type": "com.redhat.viaq.common",
...
"kubernetes.container_name": "data-sync-server",
"kubernetes.namespace_name": "alitestproj02",
...
"message": "{\"level\":30,\"time\":1539607575994,\"pid\":19,\"hostname\":\"data-sync-server-6-pxcsm\",\"tag\":\"AUDIT\",\"msg\":\"\",\"requestId\":20355,\"operationType\":\"query\",\"parentTypeName\":\"Meme\",\"path\":\"allMemes.866.owner\",\"success\":true,\"parent\":{\"_type\":\"meme\",\"photourl\":\"photo472\",\"owner\":\"owner35\",\"likes\":0,\"_id\":\"zzEnLAQmQeuTC1mj\",\"createdAt\":\"2018-10-15T11:58:33.896Z\",\"updatedAt\":\"2018-10-15T11:58:33.896Z\",\"id\":\"zzEnLAQmQeuTC1mj\"},\"arguments\":{},\"dataSourceType\":\"InMemory\",\"v\":1}\n",
"requestId": "20355",
"tag": "AUDIT",
...
"v": 1
}

Elasticsearch filter (numeric field) returns nothing

Type mapping
{
"pois-en": {
"mappings": {
"poi": {
"properties": {
"address": {
"type": "string",
"analyzer": "portuguese"
},
"city": {
"type": "integer"
},
(...)
"type": {
"type": "integer"
}
}
}
}
}
}
Query all:
GET pois-en/_search
{
"query":{
"match_all":{}
},
"fields": ["city"]
}
returns:
"hits": [
{
"_index": "pois-en",
"_type": "pois_poi",
"_id": "491",
"_score": 1,
"fields": {
"city": [
91
]
}
},
(...)
But when I filter using:
GET pois-en/_search
{
"query" : {
"filtered" : {
"query" : {
"match_all" : {}
},
"filter" : {
"term" : {
"city" : 91
}
}
}
}
}
It returns nothing!
I can't figure out what I'm doing wrong.
For communication between Django and Elasticsearch I'm using Elasticutils (https://github.com/mozilla/elasticutils), but I'm using Sense now to make those queries.
Thanks in advance.
The type name isn't consistent in your post (poi and pois_poi) - the returned document doesn't match your mapping.
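One way to check for the mismatch pointed out above is to compare the type names declared in the mapping with the _type values that come back in the hits. The sketch below does this in plain Python on trimmed copies of the mapping and the hit shown in the question; the helper names mapped_types and hit_types are hypothetical.

```python
def mapped_types(mapping_response, index):
    """Type names declared in the mapping of the given index."""
    return set(mapping_response[index]["mappings"].keys())


def hit_types(hits):
    """Type names that actually appear in the search hits."""
    return {hit["_type"] for hit in hits}


# Trimmed copies of the mapping and the hit from the question.
mapping_response = {
    "pois-en": {
        "mappings": {"poi": {"properties": {"city": {"type": "integer"}}}}
    }
}
hits = [{"_index": "pois-en", "_type": "pois_poi", "_id": "491"}]

# Types present in the results but absent from the mapping shown.
unmapped = hit_types(hits) - mapped_types(mapping_response, "pois-en")
print(unmapped)  # {'pois_poi'}
```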