Combining $regex and $or operators in Mongo - regex

I want to use the $or and $regex operators at the same time.
db.users.insert([{name: "Alice"}, {name: "Bob"}, {name: "Carol"}, {name: "Dan"}, {name: "Dave"}])
Using $regex works fine:
> db.users.find({name: {$regex: "^Da"}})
{ "_id" : ObjectId("53e33682b09f1ca437078b1d"), "name" : "Dan" }
{ "_id" : ObjectId("53e33682b09f1ca437078b1e"), "name" : "Dave" }
When I introduce $or, the response changes. I expected the same response:
> db.users.find({name: {$regex: {$or: ["^Da"]}}})
{ "_id" : ObjectId("53e33682b09f1ca437078b1a"), "name" : "Alice" }
{ "_id" : ObjectId("53e33682b09f1ca437078b1b"), "name" : "Bob" }
{ "_id" : ObjectId("53e33682b09f1ca437078b1c"), "name" : "Carol" }
{ "_id" : ObjectId("53e33682b09f1ca437078b1d"), "name" : "Dan" }
{ "_id" : ObjectId("53e33682b09f1ca437078b1e"), "name" : "Dave" }
I also tried to change the order of the operators:
> db.users.find({name: {$or: [{$regex: "^Da"}, {$regex: "^Ali"}]}})
error: { "$err" : "invalid operator: $or", "code" : 10068 }
However, the following query seems to work fine, but it's a little long (name is repeated):
> db.users.find({$or: [{name: {$regex: "^Da"}}, {name: {$regex: "^Ali"}}]})
{ "_id" : ObjectId("53e33682b09f1ca437078b1a"), "name" : "Alice" }
{ "_id" : ObjectId("53e33682b09f1ca437078b1d"), "name" : "Dan" }
{ "_id" : ObjectId("53e33682b09f1ca437078b1e"), "name" : "Dave" }
Is there any shorter way to use $regex and $or in queries like this?
The goal is to use $regex operator and not /.../ (real regular expressions).

The $or operator expects whole conditions so the correct form would be:
db.users.find({ "$or": [
{ "name": { "$regex": "^Da"} },
{ "name": { "$regex": "^Ali" }}
]})
Or of course using $in:
db.users.find({ "name": { "$in": [/^Da/,/^Ali/] } })
But it's a regex so you can do:
db.users.find({ "name": { "$regex": "^Da|^Ali" } })
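If the prefixes come from a list, the alternation can be built programmatically. The following is a minimal sketch in plain JavaScript (the language of the mongo shell). `escapeRegex` and `prefixFilter` are hypothetical helper names, not part of MongoDB. The sketch escapes each prefix and anchors the group once, which should match the same strings as `^Da|^Ali`.

```javascript
// Hypothetical helpers, not part of MongoDB: build a single $regex
// condition from a list of literal prefixes.
function escapeRegex(text) {
  // Escape characters that have a special meaning in a regular expression.
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

function prefixFilter(field, prefixes) {
  // "^(?:Da|Ali)" matches the same strings as "^Da|^Ali".
  const pattern = "^(?:" + prefixes.map(escapeRegex).join("|") + ")";
  return { [field]: { $regex: pattern } };
}

const filter = prefixFilter("name", ["Da", "Ali"]);
console.log(JSON.stringify(filter));
```

The resulting document could then be passed to `db.users.find(filter)`.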

It has been a while, but I would also make the regex query case-insensitive, as in the query below, so that it doesn't matter whether names were saved to the database with capital letters:
db.users.find({ "name": { "$regex": "^Da|^Ali", "$options": "i" } })
Hope it helps

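You can mix plain equality and $regex conditions inside $and or $or; you do not have to use $regex for every condition.
Of the two queries below, the first works as expected. The second behaves as if the name condition were not there, because ^J* also matches names that do not start with J.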
db.big_data.users.find(
{ $and: [
{ sex: { $regex: /^M.*/ } },
{ name: { $regex: /^J.*/ } }
] })
db.big_data.users.find({ $and: [ {sex: "M"}, { name: { $regex: /^J*/m } } ] })
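The behaviour of the second query comes down to the pattern: `J*` means zero or more `J` characters, so `^J*` also succeeds on an empty match at the start of any value. A quick check in plain JavaScript, with made-up names:

```javascript
// Sample values, assumed for illustration only.
const names = ["John", "Jane", "Alice", "Bob"];

// "^J*" matches zero or more "J" at the start, so every string matches.
const loose = names.filter((name) => /^J*/m.test(name));

// "^J" requires at least one "J" at the start.
const strict = names.filter((name) => /^J/.test(name));

console.log(loose);  // all four names
console.log(strict); // only the names starting with "J"
```

Using `^J` (or `^J.*`, as in the first query) restores the expected filtering.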

You can express OR inside the pattern itself with |, like this:
db.collName.find({ "name": { "$regex": "^Da|^Ali" ,"$options": "i" } })
and a single pattern without alternation:
db.collName.find({ "name": { "$regex": "Ali" ,"$options": "i" } })
for more info
source - https://www.cs.jhu.edu/~jason/405/lectures1-2/sld049.htm

Related

MongoDB Regex Search on Integer Value in nested object [duplicate]

I want to regex search an integer value in MongoDB. Is this possible?
I'm building a CRUD type interface that allows * for wildcards on the various fields. I'm trying to keep the UI consistent for a few fields that are integers.
Consider:
> db.seDemo.insert({ "example" : 1234 });
> db.seDemo.find({ "example" : 1234 });
{ "_id" : ObjectId("4bfc2bfea2004adae015220a"), "example" : 1234 }
> db.seDemo.find({ "example" : /^123.*/ });
>
As you can see, I insert an object and I'm able to find it by the value. If I try a simple regex, I can't actually find the object.
Thanks!
If you want to do a pattern match on numbers, the way to do it in mongo is to use the $where expression and pass in a pattern match.
> db.test.find({ $where: "/^123.*/.test(this.example)" })
{ "_id" : ObjectId("4bfc3187fec861325f34b132"), "example" : 1234 }
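The $where expression works on a number because JavaScript's `RegExp.prototype.test` converts its argument to a string before matching. A small sketch outside MongoDB, using made-up documents:

```javascript
// Made-up documents standing in for the collection.
const docs = [{ example: 1234 }, { example: 12334 }, { example: 9123 }];

// test() coerces the number to a string, so 1234 is matched as "1234".
const matched = docs.filter((doc) => /^123.*/.test(doc.example));

console.log(matched.map((doc) => doc.example));
```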
I am not a big fan of the $where query operator: it evaluates the query expression as JavaScript, it doesn't use indexes, and it is a security risk if the query is built from user input.
Starting from MongoDB 4.2 you can use the $regexMatch, $regexFind and $regexFindAll aggregation operators (available since MongoDB 4.1.9) together with $expr to do this.
let regex = /123/;
$regexMatch and $regexFind
db.seDemo.find({
  "$expr": {
    "$regexMatch": {
      "input": {"$toString": "$example"},
      "regex": regex
    }
  }
})
$regexFindAll
db.seDemo.find({
  "$expr": {
    "$gt": [
      {
        "$size": {
          "$regexFindAll": {
            "input": {"$toString": "$example"},
            "regex": "123"
          }
        }
      },
      0
    ]
  }
})
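When the field name and pattern vary, the $expr document can be produced by a small function. This is only a sketch: `regexOnNumberFilter` is a hypothetical helper name, and the field name `example` is taken from the question.

```javascript
// Hypothetical helper: build a find() filter that matches a pattern
// against the string form of a (possibly numeric) field.
function regexOnNumberFilter(field, pattern) {
  return {
    $expr: {
      $regexMatch: {
        input: { $toString: "$" + field }, // stringify the field first
        regex: pattern,
      },
    },
  };
}

const filter = regexOnNumberFilter("example", "^123");
console.log(JSON.stringify(filter, null, 2));
```

In the shell (MongoDB 4.2 or later) the result could be used as `db.seDemo.find(filter)`.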
From MongoDB 4.0 you can use the $toString operator which is a wrapper around the $convert operator to stringify integers.
db.seDemo.aggregate([
{ "$redact": {
"$cond": [
{ "$gt": [
{ "$indexOfCP": [
{ "$toString": "$example" },
"123"
] },
-1
] },
"$$KEEP",
"$$PRUNE"
]
}}
])
If what you want is to retrieve all the documents that contain a particular substring, starting from release 3.4 you can use the $redact operator, which allows $cond logic processing, together with $indexOfCP.
db.seDemo.aggregate([
{ "$redact": {
"$cond": [
{ "$gt": [
{ "$indexOfCP": [
{ "$toLower": "$example" },
"123"
] },
-1
] },
"$$KEEP",
"$$PRUNE"
]
}}
])
which produces:
{
"_id" : ObjectId("579c668c1c52188b56a235b7"),
"example" : 1234
}
{
"_id" : ObjectId("579c66971c52188b56a235b9"),
"example" : 12334
}
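The `$indexOfCP` comparison is a substring test: the operator returns -1 when the substring is absent. As a rough analogy in plain JavaScript (not the aggregation pipeline itself), with made-up values:

```javascript
// Plain-JavaScript analogy of { $gt: [ { $indexOfCP: [str, "123"] }, -1 ] }.
function containsDigits(value, digits) {
  // String(value) plays the role of $toString / $toLower on a number.
  return String(value).indexOf(digits) > -1;
}

const examples = [1234, 12334, 4567];
console.log(examples.filter((value) => containsDigits(value, "123")));
```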
Prior to MongoDB 3.4, you need to $project your document and add a computed field that holds the string value of your number.
The $toLower operator and its sibling $toUpper convert a string to lowercase and uppercase respectively, but they have a little-known feature: they can also be used to convert an integer to a string.
The $match stage then returns all the documents that match your pattern using the $regex operator.
db.seDemo.aggregate(
[
{ "$project": {
"stringifyExample": { "$toLower": "$example" },
"example": 1
}},
{ "$match": { "stringifyExample": /^123.*/ } }
]
)
which yields:
{
"_id" : ObjectId("579c668c1c52188b56a235b7"),
"example" : 1234,
"stringifyExample" : "1234"
}
{
"_id" : ObjectId("579c66971c52188b56a235b9"),
"example" : 12334,
"stringifyExample" : "12334"
}
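Since the question mentions a UI where * acts as a wildcard, the user input has to be turned into a pattern before it reaches $match or $regexMatch. A minimal sketch, assuming * is the only wildcard and everything else is literal; `wildcardToRegex` is a hypothetical name:

```javascript
// Hypothetical helper: convert "123*" style input into an anchored regex string.
function wildcardToRegex(input) {
  const escaped = input
    .split("*")
    // Escape regex metacharacters in the literal parts.
    .map((part) => part.replace(/[.*+?^${}()|[\]\\]/g, "\\$&"))
    .join(".*");
  return "^" + escaped + "$";
}

const pattern = wildcardToRegex("123*");
console.log(pattern);
console.log(new RegExp(pattern).test(String(1234)));
```

Escaping the literal parts keeps user input such as `1.2` from being interpreted as a pattern.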

Elasticsearch not working with 'not_analyzed' index

I am unable to figure out why Elasticsearch is not searching as expected on not_analyzed fields. I have the following settings in my model:
settings index: { number_of_shards: 1 } do
mappings dynamic: 'false' do
indexes :id
indexes :name, index: 'not_analyzed'
indexes :email, index: 'not_analyzed'
indexes :contact_number
end
end
def as_indexed_json(options = {})
as_json(only: [ :id, :name, :username, :user_type, :is_verified, :email, :contact_number ])
end
And my mapping in Elasticsearch is right, as shown below.
{
"users-development" : {
"mappings" : {
"user" : {
"dynamic" : "false",
"properties" : {
"contact_number" : {
"type" : "string"
},
"email" : {
"type" : "string",
"index" : "not_analyzed"
},
"id" : {
"type" : "string"
},
"name" : {
"type" : "string",
"index" : "not_analyzed"
}
}
}
}
}
}
But the issue is that when I search on the not-analyzed fields (name and email, which I want to be not analyzed), it only matches on the full word. In the example below it should return John, Johny and Tiger, all 3 records, but it only returns 2 of them.
I am searching as below:
settings = {
query: {
filtered: {
filter: {
bool: {
must: [
{ terms: { name: [ "john", "tiger" ] } },
]
}
}
}
},
size: 10
}
User.__elasticsearch__.search(settings).records
This is how I am creating the index for my user object in the after_save callback:
User.__elasticsearch__.client.indices.create(
index: User.index_name,
id: self.id,
body: self.as_indexed_json,
)
Some of the documents that should match:
[{
"_index" : "users-development",
"_type" : "user",
"_id" : "670",
"_score" : 1.0,
"_source":{"id":670,"email":"john#monkeyofdoom.com","name":"john baba","contact_number":null}
},
{
"_index" : "users-development",
"_type" : "user",
"_id" : "671",
"_score" : 1.0,
"_source":{"id":671,"email":"human#monkeyofdoom.com","name":"Johny Rocket","contact_number":null}
}
, {
"_index" : "users-development",
"_type" : "user",
"_id" : "736",
"_score" : 1.0,
"_source":{"id":736,"email":"tiger#monkeyofdoom.com","name":"tiger sherof", "contact_number":null}
} ]
Any suggestions please.
I think you would get the desired results with the keyword tokenizer combined with the lowercase filter rather than with not_analyzed.
john* did not match Johny because of case sensitivity.
This setup will work:
{
"settings": {
"analysis": {
"analyzer": {
"keyword_analyzer": {
"type": "custom",
"filter": [
"lowercase"
],
"tokenizer": "keyword"
}
}
}
},
"mappings": {
"my_type": {
"properties": {
"name": {
"type": "string",
"analyzer": "keyword_analyzer"
}
}
}
}
}
Now john* will match johny. You should use multi-fields if you have various requirements. A terms query for john won't give you john baba, because the inverted index contains no token john. You could use the standard analyzer on one field and the keyword analyzer on the other.
As per the documentation of the term query:
The term query finds documents that contain the exact term specified in the inverted index.
You are searching for john, but none of your documents contain the exact term john, which is why you were not getting any result. Either make your field analyzed and then apply a query string, or search for the exact term.
Refer to https://www.elastic.co/guide/en/elasticsearch/reference/2.x/query-dsl-term-query.html for more details.
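To see why the analyzer choice changes what a terms query can find, here is a deliberately simplified model in JavaScript. It is not Elasticsearch code: the three functions only imitate which tokens end up in the inverted index for a not_analyzed field, for a keyword tokenizer with a lowercase filter, and for a standard-style analyzer.

```javascript
// Simplified stand-ins for Elasticsearch analysis (illustration only).
const notAnalyzed = (value) => [value];                    // one token, unchanged
const keywordLowercase = (value) => [value.toLowerCase()]; // one token, lowercased
const standardLike = (value) =>
  value.toLowerCase().split(/[^a-z0-9]+/).filter(Boolean); // one token per word

// A terms query matches when any indexed token equals any requested term.
function termsMatch(tokens, terms) {
  return tokens.some((token) => terms.includes(token));
}

const names = ["john baba", "Johny Rocket", "tiger sherof"];
const terms = ["john", "tiger"];

for (const analyze of [notAnalyzed, keywordLowercase, standardLike]) {
  const hits = names.filter((name) => termsMatch(analyze(name), terms));
  console.log(analyze.name, hits);
}
```

In this model only the word-level analyzer lets `john` hit `john baba`, and none of the three matches `Johny Rocket`, which would need a prefix or wildcard query such as `john*`.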

Regex inside array in mongoDB

I want to query inside an array in MongoDB with a regex. The collection has documents like this:
{
"_id" : ObjectId("53340d07d6429d27e1284c77"),
"company" : "New Company",
"worktypes" : [
{
"name" : "Pompas",
"works" : [
{
"name" : "name 2",
"code" : "A00011",
"price" : "22,22"
},
{
"name" : "name 3",
"code" : "A00011",
"price" : "22,22"
},
{
"name" : "name 4",
"code" : "A00011",
"price" : "22,22"
},
{
"code" : "asdasd",
"name" : "asdads",
"price" : "22"
},
{
"code" : "yy",
"name" : "yy",
"price" : "11"
}
]
},
{
"name" : "name 4",
"works" : [
{
"code" : "A112",
"name" : "Nombre",
"price" : "11,2"
}
]
},
{
"name" : "ee",
"works" : [
{
"code" : "aa",
"name" : "aa",
"price" : "11"
},
{
"code" : "A00112",
"name" : "Nombre",
"price" : "12,22"
}
]
}
]
}
Then I need to find a document by the company name where any work inside it matches a regex on the work's code or name.
I have this:
var companyquery = { "company": "New Company"};
var regQuery = new RegExp('^A0011.*$', 'i');
db.categories.find({$and: [companyquery,
{$or: [
{"worktypes.works.$.name": regQuery},
{"worktypes.works.$.code": regQuery}
]}]})
But it doesn't return any result. I think the error is in trying to search inside the array with the dot and $.
Any idea?
Edit:
With this:
db.categories.find({$and: [{"company":"New Company"},
{$or: [
{"worktypes.works.name": {"$regex": "^A00011$|^a00011$"}},
{"worktypes.works.code": {"$regex": "^A00011$|^a00011$"}}
]}]})
This is the result:
{
"_id" : ObjectId("53340d07d6429d27e1284c77"),
"company" : "New Company",
"worktypes" : [
{
"name" : "Pompas",
"works" : [
{
"name" : "name 2",
"code" : "A00011",
"price" : "22,22"
},
{
"code" : "aa",
"name" : "aa",
"price" : "11"
},
{
"code" : "A00112",
"name" : "Nombre",
"price" : "12,22"
},
{
"code" : "asdasd",
"name" : "asdads",
"price" : "22"
},
{
"code" : "yy",
"name" : "yy",
"price" : "11"
}
]
},
{
"name" : "name 4",
"works" : [
{
"code" : "A112",
"name" : "Nombre",
"price" : "11,2"
}
]
},
{
"name" : "Bombillos"
},
{
"name" : "Pompas"
},
{
"name" : "Bombillos 2"
},
{
"name" : "Other type"
},
{
"name" : "Other new type"
}
]
}
The regex doesn't filter the results correctly.
You are using a JavaScript native RegExp object for the regular expression, however for mongo to process the regular expression it needs to be sent as part of the query document, and this is not the same thing.
Also, the regex will not match the values that you want. It could actually be ^A00011$ for the exact match, but your case-insensitive match causes a problem: it forces a larger scan of a possible index. So there is a better way to write that. Also see the documentation link for the problems with case-insensitive matches.
Use the $regex operator instead:
db.categories.find({
"$and": [
{"company":"New Company"},
{ "$or": [
{ "worktypes.works.name": { "$regex": "^A00011$|^a00011$" }},
{ "worktypes.works.code": { "$regex": "^A00011$|^a00011$" }}
]}
]
})
Also, the positional $ placeholders are not valid in a query. They are only used in a projection or an update, where they refer to the first matching element found by the query.
But your actual problem seems to be that you are trying to only get the elements of an array that "match" your conditions. You cannot do this with .find() and for that you need to use .aggregate() instead:
db.categories.aggregate([
// Always makes sense to match the actual documents
{ "$match": {
"$and": [
{"company":"New Company"},
{ "$or": [
{ "worktypes.works.name": { "$regex": "^A00011$|^a00011$" }},
{ "worktypes.works.code": { "$regex": "^A00011$|^a00011$" }}
]}
]
}},
// Unwind the worktypes array
{ "$unwind": "$worktypes" },
// Unwind the works array
{ "$unwind": "$worktypes.works" },
// Then use match to filter only the matching entries
{ "$match": {
"$or": [
{ "worktypes.works.name": { "$regex": "^A00011$|^a00011$" } },
{ "worktypes.works.code": { "$regex": "^A00011$|^a00011$" } }
]
}},
/* Stop */
// If you "really" need the arrays back then include all the following
// Otherwise the steps up to here actually got you your results
// First put the "works" array back together
{ "$group": {
"_id": {
"_id": "$_id",
"company": "$company",
"workname": "$worktypes.name"
},
"works": { "$push": "$worktypes.works" }
}},
// Then put the "worktypes" array back
{ "$group": {
"_id": "$_id._id",
"company": { "$first": "$_id.company" },
"worktypes": {
"$push": {
"name": "$_id.workname",
"works": "$works"
}
}
}}
])
So what .aggregate() does with all of these stages is it breaks the array elements into normal document form so they can be filtered using the $match operator. In that way, only the elements that "match" are returned.
What "find" is correctly doing is matching the "document" that meets the conditions. Since documents contain the elements that match then they are returned. The two principles are very different things.
When you mean to "filter" use aggregate.
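The difference between matching a document and filtering its array elements can be imitated in plain JavaScript. This is only an analogy, using a made-up, shortened document in the shape from the question:

```javascript
// Made-up, shortened document in the shape used in the question.
const doc = {
  company: "New Company",
  worktypes: [
    { name: "Pompas", works: [{ name: "name 2", code: "A00011" }, { name: "yy", code: "yy" }] },
    { name: "ee", works: [{ name: "aa", code: "aa" }] },
  ],
};
const pattern = /^A00011$/i;
const workMatches = (work) => pattern.test(work.name) || pattern.test(work.code);

// What find() decides: does ANY nested element match? If so, the whole document comes back.
const documentMatches = doc.worktypes.some((type) => type.works.some(workMatches));

// What the question wants: keep only the matching elements.
const filteredWorktypes = doc.worktypes
  .map((type) => ({ ...type, works: type.works.filter(workMatches) }))
  .filter((type) => type.works.length > 0);

console.log(documentMatches);
console.log(JSON.stringify(filteredWorktypes));
```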
I think there is a typo:
the regex should be ^A00011.*$
with a triple 0 instead of a double 0.
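The two patterns can be compared directly in JavaScript against a few of the codes from the sample document:

```javascript
const original = new RegExp("^A0011.*$", "i");   // pattern from the question
const corrected = new RegExp("^A00011.*$", "i"); // pattern with three zeros

// Print each code with the result of both patterns.
for (const code of ["A00011", "A00112", "A112"]) {
  console.log(code, original.test(code), corrected.test(code));
}
```

The original pattern misses `A00011` but does match `A00112`, while the corrected one matches only `A00011`.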
You can try the aggregate method with aggregation array operators. This query is supported from MongoDB 4.2:
$match to match your condition
$addFields to add or update a field in the document
$map to iterate over the worktypes array
$filter to iterate over the works array and return only the elements that satisfy the condition
$regexMatch to match the regex, the same as in the $match stage; it returns a boolean, so the two checks are combined with $or
$mergeObjects to merge the current worktypes object with the updated works array
a second $addFields to remove worktypes whose works array is empty
$filter to iterate over the worktypes array and drop the entries with an empty works array
db.categories.aggregate([
{
$match: {
$and: [
{ "company": "New Company" },
{
$or: [
{ "worktypes.works.name": { "$regex": "^A00011$|^a00011$" } },
{ "worktypes.works.code": { "$regex": "^A00011$|^a00011$" } }
]
}
]
}
},
{
$addFields: {
worktypes: {
$map: {
input: "$worktypes",
in: {
$mergeObjects: [
"$$this",
{
works: {
$filter: {
input: "$$this.works",
cond: {
$or: [
{
$regexMatch: {
input: "$$this.name",
regex: "^A00011$|^a00011$"
}
},
{
$regexMatch: {
input: "$$this.code",
regex: "^A00011$|^a00011$"
}
}
]
}
}
}
}
]
}
}
}
}
},
{
$addFields: {
worktypes: {
$filter: {
input: "$worktypes",
cond: { $ne: ["$$this.works", []] }
}
}
}
}
])
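Because the pattern appears four times in the pipeline above, it may help to generate the stages from a function. This is a sketch only: `worksFilterPipeline` is a hypothetical helper name, and the `$match` stage uses an implicit AND in place of the explicit `$and`.

```javascript
// Hypothetical helper: build the same three stages with the pattern stated once.
function worksFilterPipeline(company, pattern) {
  // Inside $filter, "$$this" refers to the current element of the works array.
  const workMatches = {
    $or: [
      { $regexMatch: { input: "$$this.name", regex: pattern } },
      { $regexMatch: { input: "$$this.code", regex: pattern } },
    ],
  };
  return [
    { $match: { company: company, $or: [
      { "worktypes.works.name": { $regex: pattern } },
      { "worktypes.works.code": { $regex: pattern } },
    ] } },
    { $addFields: { worktypes: { $map: {
      input: "$worktypes",
      in: { $mergeObjects: ["$$this", { works: { $filter: {
        input: "$$this.works", cond: workMatches,
      } } }] },
    } } } },
    { $addFields: { worktypes: { $filter: {
      input: "$worktypes", cond: { $ne: ["$$this.works", []] },
    } } } },
  ];
}

const pipeline = worksFilterPipeline("New Company", "^A00011$|^a00011$");
console.log(JSON.stringify(pipeline, null, 2));
```

In the shell the result could be passed as `db.categories.aggregate(pipeline)`.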
