How to use regex inside in query using morphia?

How to use regex inside in query using morphia? - regex

Mongodb allows regex expression of pattern /pattern/ without using $regex expression.
http://docs.mongodb.org/manual/reference/operator/query/in/
How can i do it using morphia ?
If i give Field criteria with field operator as in and value of type "java.util.regex.Pattern" then the equivalent query generated in
$in:[$regex: 'given pattern'] which wont return expected results at all.
Expectation: $in :[ /pattern1 here/,/pattern2 here/]
Actual using 'Pattern' object : $in : [$regex:/pattern1 here/,$regex:/pattern 2 here/]

I'm not entirely sure what to make of your code examples, but here's a working Morphia code snippet:
Pattern regexp = Pattern.compile("^" + email + "$", Pattern.CASE_INSENSITIVE);
mongoDatastore.find(EmployeeEntity.class).filter("email", regexp).get();
Note that this is really slow. It can't use an index and will always require a full collection scan, so avoid it at all cost!
Update: I've added a specific code example. The $in is not required to search inside an array. Simply use /^I/ as you would in string:
> db.profile.find()
{
"_id": ObjectId("54f3ac3fa63f282f56de64bd"),
"tags": [
"India",
"Australia",
"Indonesia"
]
}
{
"_id": ObjectId("54f3ac4da63f282f56de64be"),
"tags": [
"Island",
"Antigua"
]
}
{
"_id": ObjectId("54f3ac5ca63f282f56de64bf"),
"tags": [
"Spain",
"Mexico"
]
}
{
"_id": ObjectId("54f3ac6da63f282f56de64c0"),
"tags": [
"Israel"
]
}
{
"_id": ObjectId("54f3ad17a63f282f56de64c1"),
"tags": [
"Germany",
"Indonesia"
]
}
{
"_id": ObjectId("54f3ad56a63f282f56de64c2"),
"tags": [
"ireland"
]
}
> db.profile.find({ tags: /^I/ })
{
"_id": ObjectId("54f3ac3fa63f282f56de64bd"),
"tags": [
"India",
"Australia",
"Indonesia"
]
}
{
"_id": ObjectId("54f3ac4da63f282f56de64be"),
"tags": [
"Island",
"Antigua"
]
}
{
"_id": ObjectId("54f3ac6da63f282f56de64c0"),
"tags": [
"Israel"
]
}
{
"_id": ObjectId("54f3ad17a63f282f56de64c1"),
"tags": [
"Germany",
"Indonesia"
]
}
Note: The position in the array makes no difference, but the search is case sensitive. Use /^I/i if this is not desired or Pattern.CASE_INSENSITIVE in Java.

Single RegEx Filter
use .filter(), .criteria(), or .field()
query.filter("email", Pattern.compile("reg.*exp"));
// or
query.criteria("email").contains("reg.*exp");
// or
query.field("email").contains("reg.*exp");
Morphia converts this into:
find({"email": { $regex: "reg.*exp" } })
Multiple RegEx Filters
query.or(
query.criteria("email").contains("reg.*exp"),
query.criteria("email").contains("reg.*exp.*2"),
query.criteria("email").contains("reg.*exp.*3")
);
Morphia converts this into:
find({"$or" : [
{"email": {"$regex": "reg.*exp"}},
{"email": {"$regex": "reg.*exp.*2"}},
{"email": {"$regex": "reg.*exp.*3"}}
]
})
Unfortunately,
You cannot use $regex operator expressions inside an $in.
MongoDB Manual 3.4
Otherwise, we could do:
Pattern[] patterns = new Pattern[] {
Pattern.compile("reg.*exp"),
Pattern.compile("reg.*exp.*2"),
Pattern.compile("reg.*exp.*3"),
};
query.field().in(patterns);
hopefully, one day morphia will support that :)

Related

How to extract an element in an array if the filter element is 2 levels down

My ListInputSecurityGroup task returns this json:
{
"output": [
{
"Arn": "arn:aws:medialive:eu-north-1:xxx:inputSecurityGroup:1977625",
"Id": "1977625",
"Inputs": [],
"State": "IDLE",
"Tags": {},
"WhitelistRules": [
{
"Cidr": "5.5.5.5/32"
}
]
},
{
"Arn": "arn:aws:medialive:eu-north-1:xxx:inputSecurityGroup:5411101",
"Id": "5411101",
"Inputs": [],
"State": "IDLE",
"Tags": {
"use": "some_other_use"
},
"WhitelistRules": [
{
"Cidr": "1.1.1.1/0"
}
]
},
{
"Arn": "arn:aws:medialive:eu-north-1:xxx:inputSecurityGroup:825926",
"Id": "825926",
"Inputs": [
"4011716"
],
"State": "IN_USE",
"Tags": {
"use": "for_rtmp_pipeline"
},
"WhitelistRules": [
{
"Cidr": "0.0.0.0/0"
}
]
}
]
}
I want to use OutputPath to extract the InputSecurityGroup with the tag {use:for_rtmp_pipeline}. According to this JSONPath tester this expression works $.output[?(#.Tags.use == for_rtmp_pipeline)] and it returns the 3rd element in this array. But when used in the StepFunction itself, or in the Data Flow Simulator, it doesn't return anything. Is this a limitation of the JSONPath engine in AWS, or is there a different syntaxis? How can I extract the one element I want?
Note that in the tester the searched string should be in quotes, while in AWS there's no need for quotes.

Regex expression for getting e mail fields without domain

I have an email address like example.regex#yahoo.com. Is there a regex expression that will match example.regex#yahoo.com, example.regex, example, regex? The expression should not match Yahoo.com, Yahoo or com
Have the following e-mail address:
denisa.example#yahoo.com and I want the following strings to match the query:
denisa.example
denisa
example
I already tried it with the following elastic-search analyzer query:
{
"settings": {
"analysis": {
"filter": {
"email": {
"type": "pattern_capture",
"preserve_original": true,
"patterns": [
"([^#]+)",
"(\\p{L}+)",
"(\\d+)",
"#(.+)"
]
}
},
"analyzer": {
"email": {
"tokenizer": "uax_url_email",
"filter": [
"email",
"lowercase",
"unique"
]
}
}
}
}
}
but it gives me the following results:
denisa.example
denisa
example
yahoo.com
yahoo
com
I found an answer :
"patterns" : [
"^(.*?)#",
"(\\w+(?=.*#))"].
Thanks!

You can do something like that:
function extract(email) {
const name = email.match(/^(.*?)#.*/)[1];
return [
name,
...name.split(".")
]
}
console.log(extract("example.regex#yahoo.com"));
If you check your browser console, you will print something like that:
(3) ["example.regex", "example", "regex"]

How to match array of sub string with array of string using mongo?

I have follwoing collection structure -
{
"_id": ObjectId("54c784d71e14acf9ae833f9f"),
"vms": [
{
"name": "ABC",
"ids": [
"abc.60a980004270457730244662385a4f69",
"abc.60a980004270457730244662385a4f6d"
]
},
{
"name": "PQR",
"ids": [
"abc.6d867d9c7acd60001aed76eb2c70bd53",
"abc.60a980004270457730244662385a4f6d"
]
},
{
"name": "XYZ",
"ids": [
"abc.600605b00237d91016cdc38f376bd31d",
"abc.600605b00237d91016cdc38f376cd32f"
]
}
]
}
I have an array which contains substrings of ids. here is an array for your reference -
myArray = [ "4270457730244662385a4f69","4270457730244662385a4f6d" , "4270457730244662385a4f6b"]
I want to find each element of myArray is not present in ids as a substring using mongo.
Currently I am able to find single element using regex in mongo.
In above example, I want output as:
[
{
"name": "XYZ",
"ids": [
"abc.600605b00237d91016cdc38f376bd31d",
"abc.600605b00237d91016cdc38f376cd32f"
]
}
]
How do I find substring in array using mongo??

It is possible to do it using regex. You can match the string for multiple substrings using or operator. It is | in regex. Search for 'Boolean "or"' on wikipedia
MongoDB query using aggregation:
db.collection_name.aggregate([
{$unwind: "$vms"},
{$match: {
"vms.ids": {$not: /.*(4270457730244662385a4f69|4270457730244662385a4f6d|4270457730244662385a4f6b).*/}}
}
])
Output will be
{
"_id" : ObjectId("54c784d71e14acf9ae833f9f"),
"vms" : {
"name" : "XYZ",
"ids" : [
"abc.600605b00237d91016cdc38f376bd31d",
"abc.600605b00237d91016cdc38f376cd32f"
]
}
}

Unwind array of objects mongoDB

My collection contains the following two documents
{
"BornYear": 2000,
"Type": "Zebra",
"Owners": [
{
"Name": "James Bond",
"Phone": "007"
}
]
}
{
"BornYear": 2012,
"Type": "Dog",
"Owners": [
{
"Name": "James Brown",
"Phone": "123"
},
{
"Name": "Sarah Frater",
"Phone": "345"
}
]
}
I would like to find all the animals whichs have an owner called something with James.
I try to unwind the Owners array, but cannot get access to the Name variable.

Bit of a misnomer here. To just find the "objects" or items in a "collection" then all you really need to do is match the "object/item"
db.collection.find({
"Owners.Name": /^James/
})
Which works, but does not of course limit the results to the "first" match of "James", which would be:
db.collection.find(
{ "Owners.Name": /^James/ },
{ "Owners.$": 1 }
)
As a basic projection. But that does not give any more than a "single" match, which means you need the .aggregate() method instead like so:
db.collection.aggregate([
// Match the document
{ "$match": {
"Owners.Name": /^James/
}},
// Flatten or de-normalize the array
{ "$unwind": "Owners" },
// Filter th content
{ "$match": {
"Owners.Name": /^James/
}},
// Maybe group it back
{ "$group": {
"_id": "$_id",
"BornYear": { "$first": "$BornYear" },
"Type": { "$first": "$Type" },
"Ownners": { "$push": "$Owners" }
}}
])
And that allows more than one match in a sub-document array while filtering.
The other point is the "anchor" or "^" caret on the regular expression. You really need it where you can, to make matches at the "start" of the string where an index can be properly used. Open ended regex operations cannot use an index.

You can use dot notation to match against the fields of array elements:
db.test.find({'Owners.Name': /James/})

MongoDB distinct records [duplicate]

Basically i'm trying to implement tags functionality on a model.
> db.event.distinct("tags")
[ "bar", "foo", "foobar" ]
Doing a simple distinct query retrieves me all distinct tags. However how would i go about getting all distinct tags that match a certain query? Say for example i wanted to get all tags matching foo and then expecting to get ["foo","foobar"] as a result?
The following queries is my failed attempts of achieving this:
> db.event.distinct("tags",/foo/)
[ "bar", "foo", "foobar" ]
> db.event.distinct("tags",{tags: {$regex: 'foo'}})
[ "bar", "foo", "foobar" ]

The aggregation framework and not the .distinct() command:
db.event.aggregate([
// De-normalize the array content to separate documents
{ "$unwind": "$tags" },
// Filter the de-normalized content to remove non-matches
{ "$match": { "tags": /foo/ } },
// Group the "like" terms as the "key"
{ "$group": {
"_id": "$tags"
}}
])
You are probably better of using an "anchor" to the beginning of the regex is you mean from the "start" of the string. And also doing this $match before you process $unwind as well:
db.event.aggregate([
// Match the possible documents. Always the best approach
{ "$match": { "tags": /^foo/ } },
// De-normalize the array content to separate documents
{ "$unwind": "$tags" },
// Now "filter" the content to actual matches
{ "$match": { "tags": /^foo/ } },
// Group the "like" terms as the "key"
{ "$group": {
"_id": "$tags"
}}
])
That makes sure you are not processing $unwind on every document in the collection and only those that possibly contain your "matched tags" value before you "filter" to make sure.
The really "complex" way to somewhat mitigate large arrays with possible matches takes a bit more work, and MongoDB 2.6 or greater:
db.event.aggregate([
{ "$match": { "tags": /^foo/ } },
{ "$project": {
"tags": { "$setDifference": [
{ "$map": {
"input": "$tags",
"as": "el",
"in": { "$cond": [
{ "$eq": [
{ "$substr": [ "$$el", 0, 3 ] },
"foo"
]},
"$$el",
false
]}
}},
[false]
]}
}},
{ "$unwind": "$tags" },
{ "$group": { "_id": "$tags" }}
])
So $map is a nice "in-line" processor of arrays but it can only go so far. The $setDifference operator negates the false matches, but ultimately you still need to process $unwind to do the remaining $group stage for distinct values overall.
The advantage here is that arrays are now "reduced" to only the "tags" element that matches. Just don't use this when you want a "count" of the occurrences when there are "multiple distinct" values in the same document. But again, there are other ways to handle that.

We Keep Coding

c++ django amazon-web-services regex python-2.7 google-cloud-platform list unit-testing opengl ember.js

How to use regex inside in query using morphia? - regex

Related

How to extract an element in an array if the filter element is 2 levels down

Regex expression for getting e mail fields without domain

How to match array of sub string with array of string using mongo?

Unwind array of objects mongoDB

MongoDB distinct records [duplicate]

Categories

Resources