MongoDB case insensitive query on text with parenthesis - regex

I have a very annoying problem with a case insensitive query on mongodb.
I'm using MongoTemplate in a web application and I need to execute case insensitive queries on a collection.
With this code:
Query q = new Query();
q.addCriteria(Criteria.where("myField")
.regex(Pattern.compile(fieldValue, Pattern.CASE_INSENSITIVE | Pattern.UNICODE_CASE)));
return mongoTemplate.findOne(q,MyClass.class);
I create the following query
{ "myField" : { "$regex" : "field value" , "$options" : "iu"}}
that works perfectly when I have simple text, for example:
caPITella CapitatA
But when the text contains parentheses (), the query doesn't work.
It doesn't work at all, even when the query text is written exactly as it is stored in the document. Example:
query 1:
{"myField" : "Ceratonereis (Composetia) costae" } -> 1 result (ok)
query 2:
{ "myField" : {
"$regex" : "Ceratonereis (Composetia) costae" ,
"$options" : "iu"
}} -> no results (not ok)
query 3:
{ "scientificName" : {
"$regex" : "ceratonereis (composetia) costae" ,
"$options" : "iu"
}} -> no results (....)
So, am I doing something wrong? Did I forget some Pattern.SOME flag in Pattern.compile()? Any solution?
Thanks
------ UPDATE ------
The answer from user3561036 helped me figure out how the query must be built.
I resolved it by changing the query construction to:
q.addCriteria(Criteria.where("myField")
.regex(Pattern.compile(Pattern.quote(myFieldValue), Pattern.CASE_INSENSITIVE | Pattern.UNICODE_CASE)));
The output query
{ "myField" : { "$regex" : "\\Qhaliclona (rhizoniera) sarai\\E" , "$options" : "iu"}}
works.

If you pass the $regex operator a string as input, you must escape reserved characters such as ( and ) so that they are treated as literals.
Normally that takes a single \, but since the pattern is already inside a string you double it to \\:
{ "myField" : {
"$regex" : "Ceratonereis \\(Composetia\\) costae" ,
"$options" : "iu"
}}
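To see why the unescaped pattern returns nothing, here is a minimal sketch that runs both patterns through JavaScript's regex engine, which treats parentheses as a capture group just as the engine behind $regex does. The helper name escapeRegex is hypothetical and not part of any MongoDB API, and the query object is only built, not sent to a server.

```javascript
// Hypothetical helper: backslash-escape every regex metacharacter in a literal string.
function escapeRegex(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

const stored = "Ceratonereis (Composetia) costae";

// Unescaped: "(Composetia)" is a capture group, so the pattern expects
// "Ceratonereis Composetia costae" with no literal parentheses.
const unescaped = new RegExp(stored, "i");

// Escaped: "\(" and "\)" match the literal characters.
const escaped = new RegExp(escapeRegex(stored), "i");

console.log(unescaped.test(stored)); // false
console.log(escaped.test(stored)); // true
console.log(escaped.test("ceratonereis (composetia) COSTAE")); // true

// Query document in the same shape as the one above (only built here, not executed).
const query = { myField: { $regex: escapeRegex(stored), $options: "i" } };
console.log(JSON.stringify(query));
```

The same escaping applies to any other metacharacter in the stored value, such as a dot or a plus sign.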

It's an old question, but you can escape the regex metacharacters in the user input with query.replace(/[-[\]{}()*+?.,\\/^$|#\s]/g, "\\$&");
This works with aggregate and $match:
const order = user_input.replace(/[-[\]{}()*+?.,\\/^$|#\s]/g, "\\$&");
const regex = new RegExp(order, 'i');
const query = await this.databaseModel.aggregate([
{
$match: {
name : regex
}
// ....

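Use $strcasecmp.
The aggregation framework was introduced in MongoDB 2.2. Its string operator $strcasecmp performs a case-insensitive comparison of two strings and returns 0 when they are equal.
For an exact match it can be simpler than a regex because no escaping is needed, but it is an aggregation expression rather than a query operator, and its behavior is only well-defined for ASCII strings.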
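Since $strcasecmp is an aggregation expression, one way to use it as a filter is inside $expr in a $match stage (available from MongoDB 3.6). The sketch below only builds the pipeline; the helper name and the collection name in the comment are assumptions for illustration.

```javascript
// Hypothetical helper: build a pipeline that keeps documents whose `field`
// equals `value` ignoring case. $strcasecmp returns 0 when the strings match.
function caseInsensitiveEqualsPipeline(field, value) {
  return [
    {
      $match: {
        $expr: {
          // $literal stops a value starting with "$" from being read as a field path.
          $eq: [{ $strcasecmp: ["$" + field, { $literal: value }] }, 0]
        }
      }
    }
  ];
}

const pipeline = caseInsensitiveEqualsPipeline(
  "myField",
  "ceratonereis (composetia) costae"
);
console.log(JSON.stringify(pipeline, null, 2));

// Usage against a real collection (assumed name):
// db.myCollection.aggregate(pipeline)
```

Note that a $match built on $expr like this compares every document, so it is a fit for small collections rather than a replacement for an indexed lookup.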

Related

How to query with conditionals in MongoDB

I am new to MongoDB and am learning how to query for multiple things at once with conditionals.
I have a database with a collection called 'towns' whose documents contain an id, name, population, date of last census, items the town is famous for, and mayor. For example, this is what one of the towns looks like (please keep in mind that this is old, random data, nothing is up to date; it is just an example for me to learn):
{
"_id" : ObjectId("60232b0bbae1e5336c5ebc96"),
"name" : "New York",
"population" : 22200000,
"lastCensus" : ISODate("2016-07-05T00:00:00Z"),
"famousFor" : [
"the MOMA",
"food"
],
"mayor" : {
"name" : "Bill de Blasio",
"party" : "D"
}
}
I am trying to find all towns with names that contain an e and that are famous for food or beer.
I currently have this query:
db.towns.find({name: {$regex:"e"}}, {$or: [{famousFor:{$regex: 'food'}}, {famousFor:{$regex: 'beer'}}]})
If I split up the name and the $or expression, it works, but together I get errors like:
Error: error: {
"ok" : 0,
"errmsg" : "Unrecognized expression '$regex'",
"code" : 168,
"codeName" : "InvalidPipelineOperator"
}
Or, if I switch the query to db.towns.find({name:/e/}, {$or: [{famousFor:/food/}, {famousFor:/beer/}]}) I get the error:
Error: error: {
"ok" : 0,
"errmsg" : "FieldPath field names may not start with '$'.",
"code" : 16410,
"codeName" : "Location16410"
}
What am I doing wrong? Is it how I am structuring the query?
Thanks in advance!
The problem is the syntax.
find({condition goes here}, {projection goes here})
You need to put all of your conditions inside the first pair of curly braces; the second argument is treated as a projection.
db.towns.find({name: {$regex:"e"}, $or: [{famousFor:{$regex: 'food'}}, {famousFor:{$regex: 'beer'}}]})
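As a rough illustration of the two arguments, the sketch below writes the filter and an optional projection as separate objects, then applies a plain JavaScript equivalent of the filter to sample data. The projection fields and the last two towns are made up for illustration, and matchesFilter is a hypothetical stand-in for what the server would evaluate.

```javascript
// First argument: the filter. Top-level keys are combined with an implicit AND.
const filter = {
  name: { $regex: "e" },
  $or: [{ famousFor: { $regex: "food" } }, { famousFor: { $regex: "beer" } }]
};

// Optional second argument: the projection (which fields to return).
const projection = { name: 1, famousFor: 1, _id: 0 };

// In the shell this would be: db.towns.find(filter, projection)

// Sample data; the last two towns are invented for this example.
const towns = [
  { name: "New York", famousFor: ["the MOMA", "food"] },
  { name: "Portland", famousFor: ["beer", "food"] },
  { name: "Springfield", famousFor: ["donuts"] }
];

// Plain JavaScript equivalent of the filter above.
function matchesFilter(town) {
  const nameHasE = /e/.test(town.name);
  const famousForFoodOrBeer = town.famousFor.some((item) => /food|beer/.test(item));
  return nameHasE && famousForFoodOrBeer;
}

const result = towns.filter(matchesFilter).map((town) => town.name);
console.log(result); // [ 'New York' ]
```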

How to find all values in word by using regexp in MongoDB?

Let's say I have the following string in MongoDB document:
{"name": "space delimited string"}
I need to build a MongoDB query with a regexp that finds this document when the following search request is entered:
space string
It works like the LIKE operator in an RDBMS. I know that the latest MongoDB 3 has full-text search, but I need a regexp because my current version is outdated.
Please help me construct a MongoDB query with a regexp that finds the document for the search above.
Thanks
As I see it there are a couple of options.
If you mean "AND" for all words then use positive lookahead:
{ "name": /(?=.*\bspace\b)(?=.*\bstring\b).+/ }
or if an $all operator suits you better:
{ "name": { "$all": [/\bspace\b/,/\bstring\b/] } }
And if you mean "OR" for either of the words then you can do:
{ "name": /\bspace\b|\bstring\b/ }
or use an $in operator:
{ "name": { "$in": [/\bspace\b/,/\bstring\b/] } }
Note that in all cases you likely want those \b boundary matches in there to delimit the "word"; otherwise you will also match "partial" words.
So it depends on which you mean and which suits you best. You can construct the regular expression using its own syntax to mean either "AND" or "OR", or alternatively you can use the equivalent MongoDB logical expressions ( $all or $in ) that take a "list" of regular expressions instead.
So build a string for regex or build a list. Your choice.
Naturally you need to "break up" the string into "words" in order to process it. There is no language tag here, but as a JavaScript example:
As a single regular expression for "AND":
var searchString = "space string";
var expression = new RegExp(
"" + searchString.split(" ").map(function(word) {
return "(?=.*\\b" + word + "\\b)"
}).join("") + ".+"
)
var query = { "name": expression };
Or for an "OR" condition on a single expression:
var expression = new RegExp(
searchString.split(" ").map(function(word) {
return "\\b" + word + "\\b"
}).join("|")
);
var query = { "name": expression };
Or as a list of expressions:
var type = "AND",
query = { "name": {} };
// List of expressions
var list = searchString.split(" ").map(function(word) {
return new RegExp("\\b" + word + "\\b")
});
// Determine operator based on type
query.name[( type === "AND") ? "$all" : "$in"] = list;
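The snippets above can be folded into one helper and checked locally before being sent to MongoDB. This is a minimal sketch: buildWordRegex and escapeRegex are hypothetical names, and the escaping step is an addition to the original code so that a search word containing a regex metacharacter is treated literally.

```javascript
// Hypothetical helper: backslash-escape regex metacharacters in a search word.
function escapeRegex(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Hypothetical helper: "AND" uses one lookahead per word, anything else uses alternation.
function buildWordRegex(searchString, type) {
  const words = searchString.trim().split(/\s+/).map(escapeRegex);
  if (type === "AND") {
    return new RegExp(words.map((w) => "(?=.*\\b" + w + "\\b)").join("") + ".+");
  }
  return new RegExp(words.map((w) => "\\b" + w + "\\b").join("|"));
}

const andExpr = buildWordRegex("space string", "AND");
const orExpr = buildWordRegex("space string", "OR");

console.log(andExpr.test("space delimited string")); // true
console.log(andExpr.test("space only")); // false
console.log(orExpr.test("space only")); // true
console.log(orExpr.test("spaceship strings")); // false

// Query document for the "AND" case, in the same shape as above.
const query = { name: andExpr };
```

The last check shows the effect of the \b boundaries: neither "spaceship" nor "strings" counts as a whole-word match.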

find str in another str with regex

I defined:
var s1="roi john";
var s2="hello guys my name is roi levi or maybe roy";
I need to split s1 into words and check whether they are contained in s2.
If so, return the posts in which they exist.
The best way to help me is with a regex, because I need this check for MongoDB.
Please let me know the proper regex.
Thanks.
This could possibly be answered with just the regular expression (and actually is), but consider the data:
{ "phrase" : "hello guys my name is roi levi or maybe roy" }
{ "phrase" : "and another sentence from john" }
{ "phrase" : "something about androi" }
{ "phrase" : "johnathan was here" }
You match with MongoDB like this:
db.collection.find({ "phrase": /\broi\b|\bjohn\b/ })
And that gets the two documents that match:
{ "phrase" : "hello guys my name is roi levi or maybe roy" }
{ "phrase" : "and another sentence from john" }
So the regex works by keeping the word boundaries \b around the words to match so they do not partially match something else and are combined with an "or" | condition.
Play with this pattern in a regex tester such as the regexer.
Doing open-ended $regex queries like this in MongoDB can often be bad for performance. I'm not sure of your actual use case, but it is possible that a "full text search" solution would be better suited to your needs. MongoDB has full text indexing and search, or you can use an external solution.
Anyhow, this is how you match your words using a $regex condition.
To actually process your string as input you will need some code before doing the search:
var string = "roi john";
var splits = string.split(" ");
for ( var i = 0; i < splits.length; i++ ) {
splits[i] = "\\b" + splits[i] + "\\b";
}
var exp = splits.join("|");
db.collection.find({ "phrase": { "$regex": exp } })
You can also combine that with the case-insensitive "$options" setting if that is what you want. The second usage form, with the literal $regex operator, is actually the safer form to use in languages other than JavaScript.
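A minimal sketch of the case-insensitive variant follows. It builds the { $regex, $options } form and checks it locally against the four sample phrases, with JavaScript's regex engine standing in for the server-side match. The helper name buildPhraseQuery is hypothetical, and the input words are assumed to contain no regex metacharacters.

```javascript
const phrases = [
  "hello guys my name is roi levi or maybe roy",
  "and another sentence from john",
  "something about androi",
  "johnathan was here"
];

// Hypothetical helper: build the { $regex, $options } form from a space-separated input.
// Assumes the words contain no regex metacharacters.
function buildPhraseQuery(input) {
  const exp = input
    .trim()
    .split(/\s+/)
    .map((word) => "\\b" + word + "\\b")
    .join("|");
  return { phrase: { $regex: exp, $options: "i" } };
}

const query = buildPhraseQuery("ROI John");
console.log(query.phrase.$regex); // \bROI\b|\bJohn\b

// Local check standing in for db.collection.find(query).
const re = new RegExp(query.phrase.$regex, query.phrase.$options);
const matched = phrases.filter((phrase) => re.test(phrase));
console.log(matched.length); // 2
```

With the "i" option, the upper-case input still matches the two lower-case phrases, while "androi" and "johnathan" remain excluded by the word boundaries.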
Using a loop to iterate over the words of s1 and checking each one against s2 will give the expected result:
var s1="roi john";
var s2="hello guys my name is roi levi or maybe roy";
var arr1 = s1.split(" ");
for(var i=0;i<arr1.length;i++){
if (s2.indexOf(arr1[i]) != -1){
console.log("The string contains "+arr1[i]);
}
}

Scala Map[Regex, String] collectFirst error

I am trying to automatically convert a string to Date based on regex matches. My code thus far is as below:
package be.folks.date
import java.util.Date
import scala.util.matching.Regex
import org.joda.time.format.DateTimeFormat
class StringToDate(underlying:String) {
val regmap : Map[Regex, String] = Map(
("""\d\d-\d\d-\d\d\d\d""".r, "dd-MM-yyyy"),
("""\d\d-\w\w\w-\d\d\d\d""".r, "dd-MMM-yyyy")
)
def toDate() : Date = {
DateTimeFormat.forPattern((regmap collectFirst { case (_(underlying) , v) => v } get)).parseDateTime(underlying).toDate()
}
}
object StringToDate {
implicit def +(s:String) = new StringToDate(s)
}
However, I am getting an error for "_" - ) expected but found (.
How do I correct this?
I'm not sure I understand your syntax to apply the Regex. Maybe, in toDate, you wanted:
regmap collectFirst {
case (pattern , v) if((pattern findFirstIn underlying).nonEmpty) => v}
I also would not use get to extract the string from the option, as it throws an exception if no matching regex is found. I don't know how you want to manage that case in your code so I can't give you suggestions.

SpringMongo Case insensitive search regex

I am trying a case-insensitive search in Mongo. Basically I want a case-insensitive string match, and I am using a regex. Here is my code:
Query query = new Query( Criteria.where(propName).regex(value.toString(), "i"));
But the above doesn't match against my whole string (a string sometimes with spaces). It returns values even if the match is only a substring.
E.g. suppose my collection has two values, "Bill" and "Bill status". It returns "bill" even if my search is "bill status". It returns results even when there is only a substring of the string I am searching for.
I tried Query query = new Query( Criteria.where(propName).is(value.toString()));
But the above is case sensitive. Can someone please help with this?
Case-insensitive search using a regex. input_location could be Delhi, delhi or DELHI; it works for all of them.
Criteria.where("location").regex(Pattern.compile(input_location, Pattern.CASE_INSENSITIVE | Pattern.UNICODE_CASE));
The regex /^bill$/i will match just against "Bill" in a case-insensitive manner.
Here is an example showing this (in the mongo shell):
> db.foo.insert({name: "Bill"});
> db.foo.insert({name: "Bill status"});
> db.foo.insert({name: "another Bill"});
> db.foo.find()
{ "_id" : ObjectId("5018e182a499db774b92bf25"), "name" : "Bill" }
{ "_id" : ObjectId("5018e191a499db774b92bf26"), "name" : "Bill status" }
{ "_id" : ObjectId("5018e19ba499db774b92bf27"), "name" : "another Bill" }
> db.foo.find({name: /bill/i})
{ "_id" : ObjectId("5018e182a499db774b92bf25"), "name" : "Bill" }
{ "_id" : ObjectId("5018e191a499db774b92bf26"), "name" : "Bill status" }
{ "_id" : ObjectId("5018e19ba499db774b92bf27"), "name" : "another Bill" }
> db.foo.find({name: /^bill$/i})
{ "_id" : ObjectId("5018e182a499db774b92bf25"), "name" : "Bill" }
However, a regex query can use an index efficiently only if it is left-rooted (i.e. of the form /^prefix/) and the i case-insensitive flag is not used. Otherwise you may pay a substantial performance penalty compared with a query that uses an index. As such, depending on your use case, a better alternative might be to use application logic in some way, for example:
Enforce a case when you insert items into the database (e.g. convert "bill" to "Bill").
Do a search against your known case (e.g. search just against "Bill").
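Both approaches can be sketched in a few lines of JavaScript. The first helper anchors and escapes the search value so that only whole-string matches count. The second part stores a normalized copy of the value and queries it with plain equality. The names exactInsensitive, escapeRegex and nameLower are assumptions for illustration, not part of MongoDB or Spring.

```javascript
// Hypothetical helper: backslash-escape regex metacharacters in the search value.
function escapeRegex(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Hypothetical helper: anchor the escaped value so only whole-string matches count.
function exactInsensitive(value) {
  return new RegExp("^" + escapeRegex(value) + "$", "i");
}

const names = ["Bill", "Bill status", "another Bill"];

const billOnly = names.filter((name) => exactInsensitive("bill").test(name));
const billStatus = names.filter((name) => exactInsensitive("bill status").test(name));
console.log(billOnly); // [ 'Bill' ]
console.log(billStatus); // [ 'Bill status' ]

// Alternative from the text: store a normalized copy and query it with plain equality.
// The field name `nameLower` is an assumption for illustration.
const doc = { name: "Bill status", nameLower: "Bill status".toLowerCase() };
const equalityQuery = { nameLower: "BILL STATUS".toLowerCase() };
console.log(doc.nameLower === equalityQuery.nameLower); // true
```

The equality query on the normalized field can use an ordinary index, which the anchored case-insensitive regex generally cannot do efficiently.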