'like' or $regex query inside $cond in MongoDB - regex

Please go through this question of mine:
MongoDB $group and explicit group formation with computed column
But this time, I need to compare strings, not numbers. The CASE query must have a LIKE:
CASE WHEN source LIKE '%Web%' THEN 'Web'
I then need to group by source. How to write this in Mongo? I am trying the following but not sure if $regex is supported inside $cond. By the way, is there a list of valid operators inside $cond somewhere? Looks like $cond isn't very fond of me :)
db.Twitter.aggregate(
{ $project: {
"_id":0,
"Source": {
$cond: [
{ $regex:['$source','/.* Android.*/'] },
'Android',
{ $cond: [
{ $eq: ['$source', 'web'] }, 'Web', 'Others'
] }
]
}
} }
);
I need to handle many other values, which means deeper nesting; this example uses only 'Android' and 'Web' for the sake of brevity. I have tried both $eq and $regex. Using $regex fails with an invalid operator error, whereas $eq does not interpret the regular expression and puts everything under 'Others'. If this is possible with a regex, kindly let me know how to write it for a case-insensitive match.
Thanks for any help :-)

Well, it still does not even seem to be scheduled for implementation :(
https://jira.mongodb.org/browse/SERVER-8892
I'm using 2.6 and took a peek on 3.0, but it's just not there.
There's one workaround though, if you can project your problem onto a stable substring. Then you can $substr the field and use multiple nested $cond. It's awkward, but it works.
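A minimal sketch of that workaround, under a loudly stated assumption: every Android value in source starts with the fixed prefix "Twitter for Android", so the 7 characters at offset 12 are stable. The prefix and offsets are hypothetical and would need to be adapted to the real data. Wrapping the substring in $toLower makes the comparison case-insensitive.

```javascript
// Sketch only: the offset 12 and length 7 assume a value such as
// "Twitter for Android ..." (hypothetical data layout).
const pipeline = [
  {
    $project: {
      _id: 0,
      Source: {
        $cond: [
          // Compare a lower-cased, fixed-position substring instead of a regex
          { $eq: [{ $toLower: { $substr: ["$source", 12, 7] } }, "android"] },
          "Android",
          {
            $cond: [
              { $eq: [{ $toLower: "$source" }, "web"] },
              "Web",
              "Others"
            ]
          }
        ]
      }
    }
  }
];

// In the mongo shell: db.Twitter.aggregate(pipeline)
```

Every additional category adds one more level of nested $cond, which is why this only stays manageable for a handful of values.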

Maybe you can try it with MapReduce.
var map = function () {
    // JavaScript regex literals; append the "i" flag (e.g. /android/i) for a case-insensitive match
    var reg1 = /Android/;
    var reg2 = /web/;
    if (reg1.test(this.source)) {
        emit(this._id, 'Android');
    } else if (reg2.test(this.source)) {
        emit(this._id, 'web');
    }
};
// Each _id is emitted at most once, so reduce is not called for these keys.
// It still has to return a value of the same shape as the emitted values.
var reduce = function (key, values) {
    return values[0];
};
db.Twitter.mapReduce(map, reduce, { out: 'map_reduce_result' });
db.map_reduce_result.find();
This way you can use JavaScript regular expressions instead of MongoDB's $regex.

Related

Nested Array search in MongoDB/PyMongo while using aggregate

I am trying to search for a keyword inside array of arrays in a Mongo document.
{
"PRODUCT_NAME" : "Truffle Cake",
"TAGS": [
["Cakes", 100],
["Flowers", 100],
]
}
Usually, I would do something like this and it would work.
db.collection.find( {"TAGS":{"$elemMatch":{ "$elemMatch": {"$in":['search_text']} } }} )
But now I have changed this query to an aggregate-based query due to other requirements. I've tried $filter and $match, but I am not able to replicate the above query exactly.
Can anyone convert the above code so that it can directly work with aggregate?
(I use PyMongo)
$match uses the same query syntax as the query language (find), from the docs:
The query syntax is identical to the read operation query syntax;
This means if you have a query that works in a "find", it will also work within a $match stage, like so:
db.collection.aggregate([
{
$match: {
"TAGS": {
"$elemMatch": {
"$elemMatch": {
"$in": [
"Cakes"
]
}
}
}
}
}
])
Check this live on Mongo Playground

MongoDB: Aggregation using $cond with $regex

I am trying to group data in multiple stages.
At the moment my query looks like this:
db.captions.aggregate([
{$project: {
"videoId": "$videoId",
"plainText": "$plainText",
"Group1": {$cond: {if: {$eq: ["plainText", {"$regex": /leave\sa\scomment/i}]},
then: "Yes", else: "No"}}}}
])
I am not sure whether it is actually possible to use the $regex operator within a $cond in the aggregation stage. I would appreciate your help very much!
Thanks in advance
UPDATE: Starting with MongoDB v4.1.11, there finally appears to be a nice solution for your problem: the $regexMatch aggregation operator (alongside $regexFind and $regexFindAll), which can be used inside $cond.
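As a hedged sketch, the pipeline from the question might look like this with $regexMatch, assuming a server version that supports the operator (MongoDB 4.2 or later) and that plainText is a string, null, or missing in every document:

```javascript
// Sketch only: requires a MongoDB version that provides $regexMatch.
const pipeline = [
  {
    $project: {
      videoId: 1,
      plainText: 1,
      Group1: {
        $cond: {
          // $regexMatch evaluates to true or false, which is what $cond expects
          if: { $regexMatch: { input: "$plainText", regex: /leave\sa\scomment/i } },
          then: "Yes",
          else: "No"
        }
      }
    }
  }
];

// In the mongo shell: db.captions.aggregate(pipeline)
```

The same shape should also cover the 'Android'/'Web' bucketing from the first question, with one $regexMatch per category.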
Original answer:
As I wrote in the comments above, $regex does not work inside $cond as of now. There is an open JIRA ticket for that but it's, err, well, open...
In your specific case, I would tend to suggest you solve that topic on the client side unless you're dealing with crazy amounts of input data of which you will always only return small subsets. Judging by your query, it appears that you are always going to retrieve all documents, just bucketed into two result groups ("Yes" and "No").
If you don't want or cannot solve that topic on the client side, then here is something that uses $facet (MongoDB >= v3.4 required) - it's neither particularly fast nor overly pretty but it might help you to get started.
db.captions.aggregate([{
$facet: { // create two stages that will be processed using the full input data set from the "captions" collection
"CallToActionYes": [{ // the first stage will...
$match: { // only contain documents...
"plainText": /leave\sa\scomment/i // that are allowed by the $regex filter (which could be extended with multiple $or expressions or changed to $in/$nin which accept regular expressions, too)
}
}, {
$addFields: { // for all matching documents...
"CallToAction": "Yes" // we create a new field called "CallToAction" which will be set to "Yes"
}
}],
"CallToActionNo": [{ // similar as above except we're doing the inverse filter using $not
$match: {
"plainText": { $not: /leave\sa\scomment/i }
}
}, {
$addFields: {
"CallToAction": "No" // and, of course, we set the field to "No"
}
}]
}
}, {
$project: { // we got two arrays of result documents out of the previous stage
"allDocuments" : { $setUnion: [ "$CallToActionYes", "$CallToActionNo" ] } // so let's merge them into a single one called "allDocuments"
}
}, {
$unwind: "$allDocuments" // flatten the "allDocuments" result array
}, {
$replaceRoot: { // restore the original document structure by moving everything inside "allDocuments" up to the top
newRoot: "$allDocuments"
}
}, {
$project: { // include only the two relevant fields in the output (and the _id)
"videoId": 1,
"CallToAction": 1
}
}])
As always with the aggregation framework, it may help to remove individual stages from the end of the pipeline and run the partial query in order to get an understanding of what each individual stage does.
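One way to do that systematically is to slice the pipeline array and run each prefix in turn. The sketch below uses a shortened, hypothetical three-stage pipeline and a helper named firstStages, which is not part of MongoDB:

```javascript
// Hypothetical helper: returns the first `count` stages without touching the original array.
function firstStages(pipeline, count) {
  return pipeline.slice(0, count);
}

// Shortened stand-in for a real pipeline (illustration only)
const pipeline = [
  { $match: { plainText: /leave\sa\scomment/i } },
  { $addFields: { CallToAction: "Yes" } },
  { $project: { videoId: 1, CallToAction: 1 } }
];

for (let n = 1; n <= pipeline.length; n++) {
  const partial = firstStages(pipeline, n);
  // In the mongo shell: printjson(db.captions.aggregate(partial).toArray());
  console.log(n, partial.map(stage => Object.keys(stage)[0]).join(" -> "));
}
```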

Mongo query syntax error

I don't understand why one of these syntaxes works and the other one doesn't. It's my understanding that they both mean pretty much the same thing.
This works
{ 'profile.fname' : { $regex: ".*" + this.queryParams.value + ".*", $options: '-i'}},
This does not work
{ profile : { fname : { $regex: ".*" + this.queryParams.value + ".*", $options: '-i'}}},
Example data structure looks like:
{
"_id":"ybhng3YCu4W4MSzz9",
"createdAt":"2016-08-23T10:44:33.088Z",
"emails":[{"address":"xy#z.co.uk","verified":false}],
"profile":
{
"fname":"name",
"lname":"otherName"
},
"roles":["admin"]
}
The first one produces the correct result but the second one produces nothing - as in an empty array. From debugging I know this must be the wrong syntax somewhere but I cannot see it.
I am using meteor as the server side.
If you are querying against an embedded document, you need to use dot notation like you're doing in your first query.
If you supply it a document, it must be an exact match. In this case you'd literally need:
{ profile: { fname: "name", lname :"otherName"} }
Of course this looks like bad design to me; why is profile an embedded document in the first place, instead of something like this:
{
"_id":"ybhng3YCu4W4MSzz9",
"createdAt":"2016-08-23T10:44:33.088Z",
"emails":[{"address":"xy#z.co.uk","verified":false}],
"fname":"name",
"lname":"otherName",
"roles":["admin"]
}
Reference: Query on Embedded Documents
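If the nested form is easier to write, it can be converted to dot notation before it is sent to the server. The helper below, toDotNotation, is a hypothetical sketch rather than a MongoDB or Meteor API. It flattens nested field names, leaves operator objects such as { $regex: ... } intact, and does not look inside arrays (for example the clauses of $or).

```javascript
// Hypothetical helper: flattens { profile: { fname: X } } into { "profile.fname": X }.
function toDotNotation(query, prefix = "") {
  const flat = {};
  for (const [key, value] of Object.entries(query)) {
    const path = prefix ? `${prefix}.${key}` : key;
    const isPlainObject =
      value !== null &&
      typeof value === "object" &&
      !Array.isArray(value) &&
      !(value instanceof RegExp) &&
      !(value instanceof Date);
    // Objects holding operators ($regex, $gt, ...) are conditions, not sub-documents
    const isOperatorObject =
      isPlainObject && Object.keys(value).some(k => k.startsWith("$"));
    if (isPlainObject && !isOperatorObject && Object.keys(value).length > 0) {
      Object.assign(flat, toDotNotation(value, path));
    } else {
      flat[path] = value;
    }
  }
  return flat;
}

const nested = { profile: { fname: { $regex: "name", $options: "i" } } };
console.log(JSON.stringify(toDotNotation(nested)));
```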

node.js - sending regex to server

I've created a client-side MongoDB interface to talk to the server-side MongoDB.
It's very similar to the minimongo implementation in Meteor.
Here is an example:
model.find({"field": /search/}).exec(function(err, model){
construct(model);
});
Normally everything works fine, except when I use a regex.
I know what the problem is, but I cannot fix it.
The problem, as you may have guessed, is that the regex /regexParameter/, when sent to the server via Ajax, is converted to "/regexParameter/", and the quotes (single or double) turn the regex into a plain string.
On the server I have something like this:
var findObject = req.query.findObject // {"field": "/search/"} :(
req.models[config.table]
.find(findObject)
.exec(function(err, model){
return res.json({
error: err,
result: model,
});
});
Is there anything I can do to make this work without writing something like 100 lines of code that iterates through each key of the findObject and tests every string for a regex?
Thanks everyone
You are right - you cannot pass RegExp objects between client and server because during serialization they are converted to strings.
Solution? (or perhaps - a workaround)
Use $regex operator in your queries, so you don't need to use RegExp objects.
So this:
{
field: /search/
}
Becomes this:
{
field: {
$regex: 'search'
}
}
Or, giving a case-insensitive search example:
{
field: {
$regex: 'search',
$options: 'i'
}
}
(instead of field: /search/i)
Read more about $regex syntax here (including about some of its restrictions).
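If the client code should keep using regex literals, the conversion can be done once at serialization time. The following is a sketch with hypothetical helper names (regexToOperator, serializeQuery). It relies on the replacer argument of JSON.stringify and keeps only the flags that $options also understands (i, m, s), silently dropping others such as g.

```javascript
// Hypothetical helper: turns a RegExp into the equivalent $regex operator object.
function regexToOperator(re) {
  const options = [...re.flags].filter(flag => "ims".includes(flag)).join("");
  return options
    ? { $regex: re.source, $options: options }
    : { $regex: re.source };
}

// Serializes a query, replacing every RegExp value on the way.
function serializeQuery(query) {
  return JSON.stringify(query, (key, value) =>
    value instanceof RegExp ? regexToOperator(value) : value
  );
}

console.log(serializeQuery({ field: /search/i }));
```

On the server, the parsed object can then be passed to find() as it is, because it no longer contains any RegExp instances.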

Mongodb distinct query with contains query

I have a mongo collection User which contains data like:-
{
id : 1,
name : "gaurav",
skills : "C++ HTML CSS"
}
when I am searching for all users that have C++ skill in it with the following query I am getting correct results as expected
db.user.find({skills:{contains:"C++"}});
But when I am searching all the unique names from the user using the same condition I m not getting any desired result
db.user.distinct('name',{skills:{contains:"C++"}});
Can anyone help me with what I am doing wrong?
"contains" is not a valid keyword for MongoDB queries. You need $regex, which takes a general regular expression following the PCRE syntax:
db.user.distinct( "name", { "skills": { "$regex": "C\\+\\+" } })
If you are using JavaScript as your language, a regex literal is also safe:
db.user.distinct( "name", { "skills": /C\+\+/ })
Either form tests whether the string "C++" occurs anywhere within the string value of the field. The + character is a quantifier in regular expressions, so it must be escaped with a \ character. Inside a quoted string the backslash itself has to be doubled so that it reaches the regex engine.
On your data this is the result:
db.user.distinct( "name", { "skills": { "$regex": "C\\+\\+" } })
[ "gaurav" ]
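When the search term comes from user input, the escaping can be done programmatically. This is a sketch using a hypothetical escapeRegex helper; the character class is the commonly used set of regex metacharacters and is not specific to MongoDB.

```javascript
// Hypothetical helper: prefixes every regex metacharacter with a backslash.
function escapeRegex(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

const term = "C++";
const query = { skills: { $regex: escapeRegex(term) } };

// In the mongo shell: db.user.distinct("name", query)
console.log(JSON.stringify(query));
```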
Try using a regex, as in the query below (the + characters must be escaped):
db.user.distinct("name",{"skills":{"$regex":"C\\+\\+"}})