Elasticsearch 5: search a list field by sublist

I'm trying to search on an object that has a list property.
I need to select every object whose list contains all items of a given sublist.
Example:
If my object has [A,B,C], it should be returned for these queries:
[A], [A,B], [A,B,C], [A,C], [C,A] ... (input order doesn't have to match)
But if the sublist contains any element that is not in the object's list, the object should not be returned.
Example:
[D], [A,D] ...
Those queries should not match.
I've managed to do it when every item of the sublist exists, but not when some item of the sublist doesn't exist.
Any ideas?
Thanks!

Pass the sublist items as a comma-separated string to a match query and set its operator to "and", as follows:
Sample document:
{
"Id": 1,
"Name": "One",
"tags": ["A","B","C"]
}
For the sublist [A,B]:
{
"query": {
"match": {
"tags": {
"query": "A,B",
"operator": "and"
}
}
}
}
I tested this in Elasticsearch 5.6.0 and 6.1.2.

Assuming A, B, C, etc. are mapped as keyword types, multiple bool query filter clauses would be one way:
var response = client.Search<User>(s => s
.Query(q => +q
.Term(f => f.Badges, "A") && +q
.Term(f => f.Badges, "B") && +q
.Term(f => f.Badges, "C")
)
);
This generates the following query:
{
"query": {
"bool": {
"filter": [
{
"term": {
"badges": {
"value": "A"
}
}
},
{
"term": {
"badges": {
"value": "B"
}
}
},
{
"term": {
"badges": {
"value": "C"
}
}
}
]
}
}
}
A user document would need to have at least all of A, B and C badges to be considered a match.
A user document may well have other badges in addition to A, B and C; if you need to find documents that have exactly A, B and C, take a look at the terms_set query with a minimum_should_match* value set to the number of passed terms.
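As a rough sketch of the idea (not the NEST client itself), the "contains all items" rule is a subset test, and the bool query above can be built from any sublist by emitting one term filter per item. The helper names `contains_all` and `build_all_terms_query` are made up for illustration, and the field name `badges` is taken from the generated query above.

```python
from typing import Any, Dict, Iterable, List


def contains_all(doc_values: Iterable[str], wanted: Iterable[str]) -> bool:
    """True when every wanted item is present in doc_values (order ignored)."""
    return set(wanted) <= set(doc_values)


def build_all_terms_query(field: str, wanted: List[str]) -> Dict[str, Any]:
    """Build a bool query body with one term filter per wanted value.

    Hypothetical helper: it only assembles the JSON body shown above and
    does not talk to Elasticsearch.
    """
    return {
        "query": {
            "bool": {
                "filter": [{"term": {field: {"value": v}}} for v in wanted]
            }
        }
    }


if __name__ == "__main__":
    tags = ["A", "B", "C"]
    for sublist in (["A"], ["C", "A"], ["D"], ["A", "D"]):
        print(sublist, contains_all(tags, sublist))
    print(build_all_terms_query("badges", ["A", "B"]))
```

Because every filter clause has to match, a sublist containing an unknown item such as D simply matches nothing, which is the behaviour the question asks for.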


How to extract data from a list of maps and convert it into smaller maps in Dart

How can I extract data from a list of maps and convert it into a list of smaller maps in Dart?
For example, I have a list of maps:
[
{"business_id":"2",
"business_title":"Spotify",
"business_phone":"(055) 3733783",
"business_email":"Usamashafiq309#gmail.com",
"business_website":"www.spotify.com",
"business_address":"Spotify AB, Regeringsgatans bro, Stockholm, Sweden",
"business_address_longitude":"18.0680873",
"business_address_latitude":"59.33096949999999",
"business_image":"5f84c7a4bbbd0121020201602537380.png",
"business_created_at":"2020-10-20 15:40:17",
"business_category_id":"2",
"cat_id":"2",
"cat_title":"Gym",
"cat_image":"280920201601308237.png"}
,{"business_id":"2",
"business_title":"Spotify",
"business_phone":"(055) 3733783",
"business_email":"Usamashafiq309#gmail.com",
"business_website":"www.spotify.com",
"business_address":"Spotify AB, Regeringsgatans bro, Stockholm, Sweden",
"business_address_longitude":"18.0680873",
"business_address_latitude":"59.33096949999999",
"business_image":"5f84c7a4bbbd0121020201602537380.png",
"business_created_at":"2020-10-20 15:40:17",
"business_category_id":"2",
"cat_id":"2",
"cat_title":"Gym",
"cat_image":"280920201601308237.png"}
]
and I want to convert it to this:
[ {"business_id":"2",
"business_title":"Spotify",},
{"business_id": "1",
"business_title": "Pizza Hut",},
]
You can use the map function to apply a function to each element of a list, building a smaller map from each original map.
Here is a quick example:
void main() {
List l = [
{
"business_id": "2",
"business_title": "Spotify",
"business_phone": "(055) 3733783",
},
{
"business_id": "1",
"business_title": "Pizza Hut",
"business_phone": "(055) 9999999",
}
];
print(extractMap(l));
}
List extractMap(List l) {
return l
.map((element) => Map.fromEntries([
MapEntry('business_id', element['business_id']),
MapEntry('business_title', element['business_title']),
]))
.toList();
}

Elasticsearch: AND / OR conditions on a JSON string field

I am new to AWS Elasticsearch but need to create queries to search the following data with different criteria.
search_metadata (JSON string of key/value pairs) - "{\"number\":\"111\"; \"area\":\"central\"; \"code\":\"1111\"; \"type\":\"internal\"}"
category - "statement" or "bill" or "email"
datetime - "2019-05-04T00:00:00" or "2019-07-16T00:01:00"
flag - "good" or "bad"
I need to construct a query that does the following:
AND or OR conditions on the search_metadata field (JSON string) -> not sure how to do it.
along with AND conditions for category, datetime range and flag. -> Do I need to use multi_match for flag and category?
{
"query": {
"bool": {
"must": [
{
"match_phrase": {
"search_metadata": "number 111" --> not sure about AND or OR with "area" and others
}
},
{
"range": {
"datetime": {
"gte": "2019-05-04T00:00:00Z",
"lte": "2019-07-16T00:01:00Z"
}
}
}
]
}
}
}

MapReduce in CouchDB: getting the max result after map/reduce

I am a beginner with CouchDB.
I have data as below:
one:[{
"name":abc,
"value":1
},
{
"name":efg,
"value":1
},
{
"name":abc,
"value":1
},
I would like to count the documents that share the same name and get the maximum.
E.g. in my case "abc" appears twice, so the maximum (reduce function) should return
result: {"name":"abc","value":2}
Did you try this design document:
{
"_id":"_design/company",
"views":
{
"abc_customers": {
"map": "function(doc) { if (doc.name == 'abc') emit(doc.name,doc.value) }",
"reduce" : "_count"
},
"efg_customers": {
"map": "function(doc) { if (doc.name == 'efg') emit(doc.name,doc.value) }",
"reduce" : "_count"
}
}
}
See this one for a comparison of _count and _sum. Possibly you need _count.
With the CouchDB design document above, you get a count for each doc.name (abc, efg, ...), and you can then search those counts for the max/min.
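The final "search for the max" step happens on the client. A minimal Python sketch of that step, assuming the documents (or the per-name counts) have already been fetched; the counting here only stands in for what the _count views would return, and `most_common_name` is a hypothetical helper, not part of any CouchDB client:

```python
from collections import Counter
from typing import Dict, List, Tuple


def most_common_name(docs: List[Dict[str, object]]) -> Tuple[str, int]:
    """Count documents per name and return the (name, count) pair with the highest count.

    Assumes docs is non-empty and every document has a "name" key.
    """
    counts = Counter(str(doc["name"]) for doc in docs)
    name, count = counts.most_common(1)[0]
    return name, count


# Sample data copied from the question.
docs = [
    {"name": "abc", "value": 1},
    {"name": "efg", "value": 1},
    {"name": "abc", "value": 1},
]

if __name__ == "__main__":
    print(most_common_name(docs))  # ('abc', 2)
```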

MongoDB Search and Sort, with Number of Matches and Exact Match

I want to create a small MongoDB search query that sorts the result set by exact match first, followed by number of matches.
For example, if I have the following labels
Physics
11th-Physics
JEE-IIT-Physics
Physics-Physics
Then, if I search for "Physics" it should sort as
Physics
Physics-Physics
11th-Physics
JEE-IIT-Physics
Looking for the sort of "scoring" you are talking about here is an exercise in "imperfect solutions". In this case, the "best fit" starts with "text search", and "imperfect" is the term to consider first when working with the text search capabilities of MongoDB.
MongoDB is "not" a dedicated "text search" product, nor is it ( like most databases ) trying to be one. Full "text search" capabilities are reserved for dedicated products that do that as their area of expertise. So maybe not the best fit, but "text search" is given as an option for those who can live with the limitations and don't want to implement another engine. Or not yet, at least.
With that said, let's look at what you can do with the data sample as given. First set up some data in a collection:
db.junk.insert([
{ "data": "Physics" },
{ "data": "11th-Physics" },
{ "data": "JEE-IIT-Physics" },
{ "data": "Physics-Physics" },
{ "data": "Something Unrelated" }
])
Then of course, to "enable" the text search capabilities, you need to index at least one of the fields in the document with the "text" index type:
db.junk.createIndex({ "data": "text" })
Now that is "ready to go", let's have a look at a first basic query:
db.junk.find(
{ "$text": { "$search": "\"Physics\"" } },
{ "score": { "$meta": "textScore" } }
).sort({ "score": { "$meta": "textScore" } })
That is going to give results like this:
{
"_id" : ObjectId("55af83b964876554be823f33"),
"data" : "Physics-Physics",
"score" : 1.5
}
{
"_id" : ObjectId("55af83b964876554be823f30"),
"data" : "Physics",
"score" : 1
}
{
"_id" : ObjectId("55af83b964876554be823f31"),
"data" : "11th-Physics",
"score" : 0.75
}
{
"_id" : ObjectId("55af83b964876554be823f32"),
"data" : "JEE-IIT-Physics",
"score" : 0.6666666666666666
}
So that is "close" to your desired result, but of course there is no "exact match" component. In addition, the logic used by the text search capabilities with the $text operator means that "Physics-Physics" is the preferred match here.
This is because the engine does not recognize "non words" such as the "hyphen" in between. To it, the word "Physics" appears several times in the indexed content for the document, therefore it has a higher score.
Now the rest of your logic here depends on the application of "exact match" and what you mean by that. If you are looking for "Physics" in the string and "not" surrounded by "hyphens" or other characters then the following does not suit. But you can just match a field "value" that is "exactly" just "Physics":
db.junk.aggregate([
{ "$match": {
"$text": { "$search": "Physics" }
}},
{ "$project": {
"data": 1,
"score": {
"$add": [
{ "$meta": "textScore" },
{ "$cond": [
{ "$eq": [ "$data", "Physics" ] },
10,
0
]}
]
}
}},
{ "$sort": { "score": -1 } }
])
And that will give you a result that both looks at the "textScore" produced by the engine and then applies some math with a logical test. In this case where the "data" is exactly equal to "Physics" then we "weight" the score by an additional factor using $add:
{
"_id": ObjectId("55af83b964876554be823f30"),
"data" : "Physics",
"score" : 11
}
{
"_id" : ObjectId("55af83b964876554be823f33"),
"data" : "Physics-Physics",
"score" : 1.5
}
{
"_id" : ObjectId("55af83b964876554be823f31"),
"data" : "11th-Physics",
"score" : 0.75
}
{
"_id" : ObjectId("55af83b964876554be823f32"),
"data" : "JEE-IIT-Physics",
"score" : 0.6666666666666666
}
That is what the aggregation framework can do for you, by allowing manipulation of the returned data with additional conditions. The end result is passed to the $sort stage ( notice it is reversed into descending order ) so that the new value becomes the sorting key.
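To make the arithmetic concrete, here is a small Python sketch of what the $project and $sort stages do to the scores. It runs on plain dicts rather than against MongoDB; the textScore values are copied from the first result listing, and `rank_with_exact_boost` is a made-up name for illustration.

```python
from typing import Dict, List


def rank_with_exact_boost(
    hits: List[Dict[str, object]], exact: str, boost: float = 10.0
) -> List[Dict[str, object]]:
    """Add `boost` to the score of hits whose data equals `exact`, then sort descending."""
    ranked = [
        {**hit, "score": float(hit["score"]) + (boost if hit["data"] == exact else 0.0)}
        for hit in hits
    ]
    return sorted(ranked, key=lambda h: h["score"], reverse=True)


# textScore values as returned by the basic $text query above.
hits = [
    {"data": "Physics-Physics", "score": 1.5},
    {"data": "Physics", "score": 1.0},
    {"data": "11th-Physics", "score": 0.75},
    {"data": "JEE-IIT-Physics", "score": 0.6666666666666666},
]

if __name__ == "__main__":
    for hit in rank_with_exact_boost(hits, "Physics"):
        print(hit["data"], hit["score"])
```

The boost of 10 is arbitrary; it only needs to be larger than any textScore the engine is likely to produce so the exact match always sorts first.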
But the aggregation framework can really only deal with "exact matches" like this on strings. There is no facility at present to deal with regular expression matches or index positions in strings that return a meaningful value for projection. Not even a logical match. And the $regex operation is only used to "filter" in queries, so not of use here.
So if you were looking for something in a "phrase" that was a bit more involved than a "string equals" exact match, then the other option is using mapReduce.
This is another "imperfect" approach as the limitations of the mapReduce command mean that the "textScore" from such a query by the engine is "completely gone". While the actual documents will be selected correctly, the inherent "ranking data" is not available to the engine. This is a by-product of how MongoDB "projects" the "score" into the document in the first place, and "projection" is not a feature available to mapReduce.
But you can "play with" the strings using JavaScript, as in my "imperfect" sample:
db.junk.mapReduce(
function() {
var _id = this._id,
score = 0;
delete this._id;
score += this.data.indexOf(search);
score += this.data.lastIndexOf(search);
emit({ "score": score, "id": _id }, this);
},
function() {},
{
"out": { "inline": 1 },
"query": { "$text": { "$search": "Physics" } },
"scope": { "search": "Physics" }
}
)
Which gives results like this:
{
"_id" : {
"score" : 0,
"id" : ObjectId("55af83b964876554be823f30")
},
"value" : {
"data" : "Physics"
}
},
{
"_id" : {
"score" : 8,
"id" : ObjectId("55af83b964876554be823f33")
},
"value" : {
"data" : "Physics-Physics"
}
},
{
"_id" : {
"score" : 10,
"id" : ObjectId("55af83b964876554be823f31")
},
"value" : {
"data" : "11th-Physics"
}
},
{
"_id" : {
"score" : 16,
"id" : ObjectId("55af83b964876554be823f32")
},
"value" : {
"data" : "JEE-IIT-Physics"
}
}
My own "silly little algorithm" here is basically taking both the "first" and "last" index position of the matched string here and adding them together to produce a score. It's likely not what you really want, but the point is that if you can code your logic in JavaScript, then you can throw it at the engine to produce the desired "ranking".
The only real "trick" here to remember is that the "score" must be the "preceding" part of the grouping "key" here, and that if including the original document _id value then that composite key part must be renamed, otherwise the _id will take precedence of order.
This is just part of mapReduce where as an "optimization" all output "key" values are sorted in "ascending order" before being processed by the reducer. Which of course does nothing here since we are not "aggregating", but just using the JavaScript runner and document reshaping of mapReduce in general.
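The same "silly little algorithm" is easy to check outside the database. A Python sketch, assuming the matching strings are already in memory: str.find and str.rfind play the role of indexOf and lastIndexOf, and sorting (score, value) tuples mimics the ascending key order described above. The function names are made up for illustration.

```python
from typing import List, Tuple


def position_score(text: str, search: str) -> int:
    """First index of `search` plus last index of `search` in `text`."""
    return text.find(search) + text.rfind(search)


def rank_by_position(values: List[str], search: str) -> List[Tuple[int, str]]:
    """Keep values containing `search` and sort them by (score, value), ascending."""
    matches = [v for v in values if search in v]
    return sorted((position_score(v, search), v) for v in matches)


labels = [
    "Physics",
    "11th-Physics",
    "JEE-IIT-Physics",
    "Physics-Physics",
    "Something Unrelated",
]

if __name__ == "__main__":
    for score, label in rank_by_position(labels, "Physics"):
        print(score, label)
```

Putting the score first in the tuple is the same trick as putting "score" first in the composite key: it is what the sort compares before anything else.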
So the overall note is, those are the available options. None of them perfect, but you might be able to live with them or even just "accept" the default engine result.
If you want more then look at external "dedicated" text search products, which would be better suited.
Side Note: The $text searches here are preferred over $regex because they can use an index. A "non-anchored" regular expression ( without the caret ^ ) cannot use an index optimally with MongoDB. Therefore the $text searches are generally going to be a better base for finding "words" within a phrase.
One more way is to use the $indexOfCP aggregation operator to get the index of the matched string and then sort on that computed index.
Data insertion:
db.junk.insert([
{ "data": "Physics" },
{ "data": "11th-Physics" },
{ "data": "JEE-IIT-Physics" },
{ "data": "Physics-Physics" },
{ "data": "Something Unrelated" }
])
Query
const data = "Physics";
db.junk.aggregate([
{ "$match": { "data": { "$regex": data, "$options": "i" }}},
{ "$addFields": { "score": { "$indexOfCP": [{ "$toLower": "$data" }, { "$toLower": data }]}}},
{ "$sort": { "score": 1 }}
])
Output:
[
{
"_id": ObjectId("5a934e000102030405000000"),
"data": "Physics",
"score": 0
},
{
"_id": ObjectId("5a934e000102030405000003"),
"data": "Physics-Physics",
"score": 0
},
{
"_id": ObjectId("5a934e000102030405000001"),
"data": "11th-Physics",
"score": 5
},
{
"_id": ObjectId("5a934e000102030405000002"),
"data": "JEE-IIT-Physics",
"score": 8
}
]

Querying ElasticSearch to order empty strings last

I am using Django, Haystack, and ElasticSearch. I want to order my search results so that results where the ordered field value is empty ("") come after results where it is not empty. I cannot find an API in Haystack that can do this. The request sent to ElasticSearch looks like:
{
"sort":[
{
"version":{
"order":"asc"
}
}
],
"query":{
...
}
}
Is there a way to rewrite this ElasticSearch query so that results with an empty string for "version" will come after results where "version" exists?
I have implemented this in Python as:
sorted(sqs, key=lambda x: getattr(x, 'version') == '')
This query assigns _score of 1.0 to all records with non-empty version and _score of 2.0 to all records with empty version. Then it sorts by _score in ascending order and then by version in ascending order. As a result, all records with empty version are pushed to the bottom of the list.
{
"query": {
"custom_filters_score" : {
"query" : {
"constant_score": {
"query": {
.... your original query ....
}
}
},
"filters" : [
{
"filter" : { "missing" : { "field" : "version"} },
"boost" : "2"
}
]
}
},
"sort": [
{
"_score": {"order":"asc"}
},
{
"version": {"order":"asc"}
}
]
}
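For comparison, the same two-level ordering (non-empty first, then ascending by version) can be sketched in plain Python with a tuple sort key. This assumes the results are already in memory as dicts with a string "version" key, which is an assumption for illustration and not how Haystack returns them; `sort_empty_last` is a hypothetical helper.

```python
from typing import Dict, List


def sort_empty_last(items: List[Dict[str, str]], field: str = "version") -> List[Dict[str, str]]:
    """Sort by `field` ascending, pushing items whose value is "" to the end.

    The key is a tuple: False (non-empty) sorts before True (empty),
    and ties are broken by the field value itself (plain string comparison).
    """
    return sorted(items, key=lambda item: (item[field] == "", item[field]))


results = [
    {"name": "c", "version": ""},
    {"name": "b", "version": "2.0"},
    {"name": "a", "version": "1.0"},
]

if __name__ == "__main__":
    for item in sort_empty_last(results):
        print(item)
```

The first tuple element corresponds to the _score sort in the query above and the second to the sort on version.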