MongoDB Aggregation regex match object id

I have a collection;
"users": [
{
"_id": ObjectId("5c4185be19da7e815cb18f59"),
"name": "User1"
},
{
"_id": ObjectId("5c4185be19da7e815cb18f5a"),
"name": "User2"
} ]
I need to search users collection by regex.
db.results.aggregate([{
"$match": {
"name": {
"$regex": "user",
"$options": "si"
}
}
}
])
This works for searching against the name field. I tried the code below to search against the _id field, but it didn't work for me.
db.results.aggregate([{
"$match": {
"_id": {
"$regex": "18f5a",
"$options": "si"
}
}
}
])
Thanks in advance.

The _id field is of type ObjectId by default, and $regex only matches strings, so you can't regex match it directly.
If you're using MongoDB 4.0+, you can convert it to a string with $toString first.
db.results.aggregate([
{
$addFields: {
_id: {$toString: "$_id"}
}
},
{
"$match": {
"_id": {
"$regex": "18f5a",
"$options": "si"
}
}
}
])
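The pipeline above overwrites _id with its string form in the output. If you would rather keep the original ObjectId, a minimal sketch (assuming a temporary field name, idStr, which is made up for this example) is to build the pipeline with a small helper that also escapes the search fragment:

```javascript
// Escape characters that have a special meaning in a regular expression.
function escapeRegex(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Build a pipeline that matches documents whose _id hex string contains `fragment`.
// `idStr` is a hypothetical temporary field name chosen for this sketch.
function buildIdFragmentPipeline(fragment) {
  return [
    // Copy the ObjectId into a string field so $regex can see it.
    { $addFields: { idStr: { $toString: "$_id" } } },
    { $match: { idStr: { $regex: escapeRegex(fragment), $options: "i" } } },
    // Drop the temporary field again; _id stays an ObjectId.
    { $project: { idStr: 0 } }
  ];
}

const pipeline = buildIdFragmentPipeline("18f5a");
console.log(JSON.stringify(pipeline, null, 2));
// In the mongo shell this would be passed as: db.results.aggregate(pipeline)
```

The helper only builds the pipeline array, so it runs in plain Node without a database connection.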


How to apply custom score to a search field in Elasticsearch

I am making a search query in Elasticsearch and I want to treat the fields the same when they match. For example, if I search on field1 and it matches, the _score is increased by 10 (for example), and the same for field2.
I tried function_score, but it's not working. It throws an error:
"caused_by": {
"type": "class_cast_exception",
"reason": "class org.elasticsearch.index.fielddata.plain.SortedSetDVOrdinalsIndexFieldData cannot be cast to class org.elasticsearch.index.fielddata.IndexNumericFieldData (org.elasticsearch.index.fielddata.plain.SortedSetDVOrdinalsIndexFieldData and org.elasticsearch.index.fielddata.IndexNumericFieldData are in unnamed module of loader 'app')"
}
The query:
{
"track_total_hits": true,
"size": 50,
"query": {
"function_score": {
"query": {
"bool": {
"must": [
{
"term": {
"field1": {
"value": "Value 1"
}
}
},
{
"term": {
"field2": {
"value": "value 2"
}
}
}
]
}
},
"functions": [
{
"field_value_factor": {
"field": "field1",
"factor": 10,
"missing": 0
}
},
{
"field_value_factor": {
"field": "field2",
"factor": 10,
"missing": 0
}
}
],
"boost_mode": "multiply"
}
}
}
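The error occurs because field_value_factor reads a numeric field value, while field1 and field2 hold strings. You can use function_score with filter functions to boost instead.
Assuming that your mapping looks like the one below: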
{
"mappings": {
"properties": {
"field_1": {
"type": "keyword"
},
"field_2": {
"type": "keyword"
}
}
}
}
and these documents (bulk request body):
{"index":{}}
{"field_1": "foo", "field_2": "bar"}
{"index":{}}
{"field_1": "foo", "field_2": "foo"}
{"index":{}}
{"field_1": "bar", "field_2": "bar"}
you can use the weight parameter to boost the documents matched by each filter.
{
"query": {
"function_score": {
"query": {
"match_all": {}
},
"functions": [
{
"filter": {
"term": {
"field_1": "foo"
}
},
"weight": 10
},
{
"filter": {
"term": {
"field_2": "foo"
}
},
"weight": 20
}
],
"score_mode": "multiply"
}
}
}
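The query above multiplies the weights of all matching functions. Since the question asks for the score to go up by a fixed amount per matching field, a variant with "score_mode": "sum" may be closer to what is wanted. Here is a minimal sketch that builds such a request body; the helper name and the shape of its boosts argument are made up for illustration, and "boost_mode": "replace" is my choice so that the final score is just the summed weights:

```javascript
// Build a function_score query that adds `weight` to the score for every
// field whose value matches. `boosts` is a hypothetical input shape for this sketch.
function buildBoostQuery(boosts) {
  return {
    query: {
      function_score: {
        query: { match_all: {} },
        functions: boosts.map(function (b) {
          const term = {};
          term[b.field] = b.value;
          // One filter function per field; it contributes only when the term matches.
          return { filter: { term: term }, weight: b.weight };
        }),
        score_mode: "sum",     // add the weights of all matching functions
        boost_mode: "replace"  // use the function score instead of the query score
      }
    }
  };
}

const body = buildBoostQuery([
  { field: "field_1", value: "foo", weight: 10 },
  { field: "field_2", value: "foo", weight: 10 }
]);
console.log(JSON.stringify(body, null, 2));
```

With this body, a document matching both filters should be scored 10 + 10 instead of 10 * 10.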
If you want to assign manual weights to different fields in the query, you can also refer to the solution below. It always ranks matches on the highest-weighted field at the top of the query response -
Elasticsearch query different fields with different weight

Elasticsearch wildcard, regexp, match_phrase, prefix query returning wrong results

I have just started using Elasticsearch, version 7.5.1.
I want to query results which start with a particular word fragment.
For example tho* should return data containing:
thought, Thomson, those, etc.
I tried with -
Regexp
[{'regexp':{'f1':'tho.*'}},{'regexp':{'f2':'tho.*'}}]
Wildcard
[{'wildcard':{'f1':'tho*'}},{'wildcard':{'f2':'tho*'}}]
Prefix
[{'prefix':{'f1':'tho'}},{'prefix':{'f2':'tho'}}]
match_phrase
'multi_match': {'query': 'tho', 'fields': ['f1', 'f2', 'f3'], 'type': 'phrase'}
# also tried with type phrase_prefix
All of those return the correct results, but they all also return the word method.
Similarly, cat* returns the word communication.
What am I doing wrong? Is this related to the analyzer?
Edit -
Here is the field mapping -
'f1': {
'full_name': 'f1',
'mapping': {
'f1': {
'type': 'text',
'analyzer': 'some_analyzer',
'index_phrases': true
}
}
},
Since you get method in the search results as well, I think there is an issue with the analyzer that you have set (the definition of some_analyzer is not shown).
One possibility is that it uses an ngram tokenizer, which splits each word into fragments and produces the token tho for all of these words, method included (since all of them contain tho).
Here is a working example with index data, mapping, search queries, and search result, using the default analyzer.
Index Mapping:
{
"mappings": {
"properties": {
"f1": {
"type": "text"
}
}
}
}
Index Data:
{
"f1": "method"
}
{
"f1": "thought"
}
{
"f1": "Thomson"
}
{
"f1": "those"
}
Search Query using Wildcard Query:
{
"query": {
"wildcard": {
"f1": {
"value": "tho*"
}
}
}
}
Search Query using Prefix Query:
{
"query": {
"prefix": {
"f1": {
"value": "tho"
}
}
}
}
Search Query using Regexp query:
{
"query": {
"regexp": {
"f1": {
"value": "tho.*"
}
}
}
}
Search Query using match phrase prefix query:
{
"query": {
"match_phrase_prefix": {
"f1": {
"query": "tho"
}
}
}
}
The search result for all four queries above is:
"hits": [
{
"_index": "67673694",
"_type": "_doc",
"_id": "1",
"_score": 1.2039728,
"_source": {
"f1": "thought"
}
},
{
"_index": "67673694",
"_type": "_doc",
"_id": "2",
"_score": 1.2039728,
"_source": {
"f1": "Thomson"
}
},
{
"_index": "67673694",
"_type": "_doc",
"_id": "3",
"_score": 1.2039728,
"_source": {
"f1": "those"
}
}
]
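To see why an ngram tokenizer would explain the extra hits, here is a rough sketch in plain JavaScript of what such a tokenizer emits for a single word. The function name is made up, and the lowercasing step is an assumption (it stands in for a lowercase filter in the analyzer):

```javascript
// Emit every substring of `word` whose length is between minGram and maxGram,
// which is roughly what an ngram tokenizer produces for a single word.
function ngrams(word, minGram, maxGram) {
  const lower = word.toLowerCase(); // assumption: the analyzer also lowercases
  const grams = [];
  for (let size = minGram; size <= maxGram; size++) {
    for (let start = 0; start + size <= lower.length; start++) {
      grams.push(lower.slice(start, start + size));
    }
  }
  return grams;
}

// "method" contains the fragment "tho", and "communication" contains "cat".
console.log(ngrams("method", 3, 3));
console.log(ngrams("communication", 3, 3).indexOf("cat") !== -1);
```

If tho is indexed as a term of its own for the document containing method, then a prefix, wildcard, or regexp query for tho matches that document too.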

Use $regex inside $expr in mongodb aggregation

My doc looks as follows
doc = {
name: 'abc',
age:20
}
and my query looks like
{ $expr: {$and:[{ $gt:[ "$age", 10 ] },
{ $regex:["$name",'ab']}
]
}
}
But it's not working and I get an error
Unrecognized expression '$regex'
How can I make it work?
My original query looks like this
db.orders.aggregate([{
$match: {}},
{$lookup: {
from: "orders",
let: {
"customer_details": "$customerDetails"
},
pipeline: [
{
$match: {
$expr: {
$and: [
{ $or: [
{
$eq: ["$customerDetails.parentMobile","$$customer_details.parentMobile"]
},
{$eq: ["$customerDetails.studentMobile","$$customer_details.parentMobile"]
},
{$eq: ["$customerDetails.studentMobile","$$customer_details.parentMobile"]
},
{$eq: ["$customerDetails.studentMobile","$$customer_details.studentMobile"]
}
]
},
{$eq: ["$customerDetails.zipCode","$$customer_details.zipCode"]},
{$eq: ["$customerDetails.address","$$customer_details.address"]}
]
}
}
}],
as: "oldOrder"
}
}])
I want to use regex for matching address.
Any help will be greatly appreciated. Thanks in advance.
If your MongoDB version is 4.2 or later, you can use $regexMatch.
Try this:
db.collection.find({
$expr: {
$and: [
{
$gt: [
"$age",
10
]
},
{
$regexMatch: {
input: "$name",
regex: "ab"
}
}
]
}
})
$regex is a query operator, so you cannot use it inside $expr, which only supports aggregation pipeline operators. Keep the $regex condition outside $expr instead:
{
"$expr": { "$gt": ["$age", 10] } ,
"name": { "$regex": "ab" }
}
If you have MongoDB 4.2 or later, you can use $regexMatch:
{ "$expr": {
"$and": [
{ "$gt": ["$age", 10] },
{
"$regexMatch": {
"input": "$name",
"regex": "ab", // your search text here
"options": "i"
}
}
]
}}
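For the address condition in the original $lookup, the same operator can go inside the inner pipeline's $expr. This is only a sketch under some assumptions: it needs MongoDB 4.2+, it uses the outer document's address as the pattern as-is (so any regex metacharacters in an address are interpreted as regex syntax), and the helper name is made up:

```javascript
// Build the $match stage for the inner $lookup pipeline (MongoDB 4.2+).
// Assumption: the outer address is used as a case-insensitive pattern as-is,
// so regex metacharacters inside it are NOT escaped.
function buildAddressMatchStage() {
  return {
    $match: {
      $expr: {
        $and: [
          { $eq: ["$customerDetails.zipCode", "$$customer_details.zipCode"] },
          {
            $regexMatch: {
              input: "$customerDetails.address",
              regex: "$$customer_details.address",
              options: "i"
            }
          }
        ]
      }
    }
  };
}

const stage = buildAddressMatchStage();
console.log(JSON.stringify(stage, null, 2));
```

The returned stage would take the place of the $eq comparison on address in the pipeline array of the $lookup, next to the other conditions.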

Regexp vs Include performance comparison in Elasticsearch

I am working on a project and I need to aggregate the results based on the "created" and "labels" fields.
I created the following queries, and both give the result I expected. But I want to know which query runs faster.
My first query:
{
"size": 0,
"aggs": {
"HEATMAP": {
"date_histogram": {
"field": "created",
"interval": "day"
},
"aggs": {
"BEHAVIOUR_CHANGE": {
"terms": {
"field": "labels",
"include": "behavior-change"
}
},
"FIRST_OCCURRENCE": {
"terms": {
"field": "labels",
"include": "first-occurrence"
}
}
}
}
}
}
My second query:
{
"size": 0,
"aggs": {
"HEATMAP": {
"date_histogram": {
"field": "created",
"interval": "day"
},
"aggs": {
"BEHAVIOUR_CHANGE": {
"filter": {
"regexp": {
"labels": "behavior-change"
}
}
},
"FIRST_OCCURRENCE": {
"filter": {
"regexp": {
"labels": "first-occurrence"
}
}
}
}
}
}
}
Since that field is a keyword and you only need an exact match rather than a real regular expression, I would do it like the following. Note that I also added a terms query to the query part to narrow down the results before they reach the aggregations (in theory, so the aggregations have less work to do). I don't see a reason to use regexp here, so I kept the terms aggregations. If you are really interested in the performance comparison, I'd suggest setting up a load test with many more documents and terms in that field and running some tests. Elastic has its own benchmarking tool that you could use for this: Rally.
{
"size": 0,
"query": {
"terms": {
"labels": [
"behavior-change",
"first-occurrence"
]
}
},
"aggs": {
"HEATMAP": {
"date_histogram": {
"field": "created",
"interval": "day"
},
"aggs": {
"BEHAVIOUR_CHANGE": {
"terms": {
"field": "labels",
"include": "behavior-change"
}
},
"FIRST_OCCURRENCE": {
"terms": {
"field": "labels",
"include": "first-occurrence"
}
}
}
}
}
}
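One more detail that may matter for the comparison: when include is given as a string it is treated as a regular expression, while an array of values is matched exactly. A possible variant, sketched below with a made-up helper and a made-up aggregation name (LABELS), uses a single terms aggregation with an include array and returns one bucket per label instead of two named sub-aggregations:

```javascript
// Build the heatmap request for a list of exact label values.
// `LABELS` is a hypothetical aggregation name chosen for this sketch.
function buildHeatmapQuery(labels) {
  return {
    size: 0,
    // Narrow the documents down before aggregating.
    query: { terms: { labels: labels } },
    aggs: {
      HEATMAP: {
        date_histogram: { field: "created", interval: "day" },
        aggs: {
          // An include array is matched as exact values, not as a regex.
          LABELS: { terms: { field: "labels", include: labels } }
        }
      }
    }
  };
}

const request = buildHeatmapQuery(["behavior-change", "first-occurrence"]);
console.log(JSON.stringify(request, null, 2));
```

Whether this is actually faster on your data is something only a benchmark can tell.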

CouchDB exclude from view based on list of regex expressions

What's the best approach for excluding documents from a view based on a list of regular expressions? For example, I want to exclude any document where doc.issue.name matches one of the regular expressions in a list.
e.g. exclusion list: [/foo/, /bar/]
{
"_id": "1",
"issue": {
"name": "foo"
}
}
{
"_id": "2",
"issue": {
"name": "bar"
}
}
{
"_id": "3",
"issue": {
"name": "fred"
}
}
So based on the documents above, just return the document where doc.issue.name = "fred"
OK so to answer my own question here in case anybody else needs to do this type of thing!
Based on the following documents:
{
"_id": "1",
"issue": {
"name": "foo"
}
}
{
"_id": "2",
"issue": {
"name": "bar"
}
}
{
"_id": "3",
"issue": {
"name": "fred"
}
}
This map function:
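function(doc) {
  // Skip documents that have no issue name to test.
  if (!doc.issue || typeof doc.issue.name !== 'string') {
    return;
  }
  // Patterns that exclude a document from the view.
  var reg_exps = [/foo/, /bar/];
  for (var i = 0; i < reg_exps.length; i++) {
    if (reg_exps[i].test(doc.issue.name)) {
      return;
    }
  }
  emit(doc.issue.name, 1);
}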
Will only return the document with the name of "fred"
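If you want to try the exclusion logic outside CouchDB first, here is a small sketch that can be run in Node. It reads doc.issue.name as in the sample documents; the emit stub and the rows array are stand-ins written for this example (CouchDB supplies the real emit inside a view):

```javascript
// Collect emitted rows; CouchDB supplies the real emit() inside a view.
var rows = [];
function emit(key, value) {
  rows.push([key, value]);
}

// Hypothetical exclusion list for this sketch.
var exclusions = [/foo/, /bar/];

function map(doc) {
  var name = doc.issue && doc.issue.name;
  if (typeof name !== "string") {
    return;
  }
  // Skip the document when any pattern matches its issue name.
  var excluded = exclusions.some(function (re) {
    return re.test(name);
  });
  if (!excluded) {
    emit(name, 1);
  }
}

var docs = [
  { _id: "1", issue: { name: "foo" } },
  { _id: "2", issue: { name: "bar" } },
  { _id: "3", issue: { name: "fred" } }
];
docs.forEach(function (doc) {
  map(doc);
});

console.log(rows);
```

The patterns are written without the g flag on purpose: a global regex keeps its lastIndex between test() calls, which can give surprising results when the same pattern is reused across documents.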