I want to use regex in the aggregation's pipeline.
I originally did:
import re
regex = '(foo|bar)'
regexDB = re.compile(regex, re.I | re.U)
db.col.find({
'events.display': True,
'$or' : [
{ 'events.description': regexDB } ,
{ 'events.title' : regexDB }
]
})
So I tried in the $match:
regex = '(foo|bar)'
regexDB = re.compile(regex, re.I|re.U )
db.col.aggregate([
{'$unwind': "$events"},
{'$match' :
{
'events.display' : True,
'$or' : [
{ '$events.description': regexDB } ,
{ '$events.title' : regexDB }
]
}
}])
However, it does not seem to work this way. The above code is based on this example. I also found an example here for mongo's shell.
How can I perform a regex match in a pymongo aggregation?
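A likely cause, offered as a sketch and not a confirmed diagnosis: inside $match the keys are ordinary query field names, so they take no $ prefix ('events.description', not '$events.description'). The $ prefix is only for field paths used as values, as in $unwind. The helper name build_pipeline is made up for illustration, and db.col is the collection from the question.

```python
import re

def build_pipeline(pattern):
    """Return an $unwind + $match pipeline that regex-matches event fields."""
    # pymongo encodes compiled Python patterns as BSON regular expressions.
    regex_db = re.compile(pattern, re.I | re.U)
    return [
        # "$events" is a field path used as a value, so it keeps the "$".
        {"$unwind": "$events"},
        {"$match": {
            "events.display": True,
            "$or": [
                # Keys in $match are plain field names: no "$" prefix.
                {"events.description": regex_db},
                {"events.title": regex_db},
            ],
        }},
    ]

pipeline = build_pipeline("(foo|bar)")
# With a live connection: results = list(db.col.aggregate(pipeline))
```

Because $unwind runs first, events is a single embedded document by the time $match sees it, so the regex is tested against one event per pipeline document.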
I'm trying to do a regex match inside the pipeline of an aggregation $lookup.
So let's assume the following query:
$lookup: {
from: 'some-collection',
let: {
someIds: '$someIds'
},
pipeline: [
{
$match: {
$expr: {
$and: [
{
$in: ['$someId', '$$someIds']
},
{
$not: {
$eq: ['$status', 'archived']
}
}
]
}
}
}
]
}
This all works great: I can match on multiple conditions.
However, if I want to add another condition using an array of regexes, I can't get it to work:
$lookup: {
from: 'some-collection',
let: {
someIds: '$someIds'
},
pipeline: [
{
$match: {
$expr: {
$and: [
{
$in: ['$someId', '$$someIds']
},
{
$not: {
$eq: ['$status', 'archived']
}
},
{
$in: ['$some-type', [/type1/, /type2/]]
}
]
}
}
}
]
}
Why does this not work? As I understand it from the documentation, I should be able to use regex this way inside an $in operator, and I can confirm that it works, since we use it elsewhere. However, nested within a $lookup pipeline it does not.
Is this a bug, or am I overlooking something? Is there another way I can do this kind of regex match?
Evidently, the problem was that I was attempting to regex match inside the $expr operator. I'm unsure why it does not work, and I can't find anything about it in the documentation.
But moving it to a separate $match within the pipeline worked.
$lookup: {
from: 'some-collection',
let: {
someIds: '$someIds'
},
pipeline: [
{
$match: {
$expr: {
$and: [
{
$in: ['$someId', '$$someIds']
},
{
$not: {
$eq: ['$status', 'archived']
}
}
]
}
}
},
{
$match: {
'some-type': {
$in: [/type1/, /type2/]
}
}
}
]
}
If anyone can elaborate on why this is the case, feel free.
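On the why: $expr takes aggregation expressions, and the aggregation version of $in only tests whether a value equals an element of the array, so a regex placed in that array is compared as a value and never evaluated. As far as I know, MongoDB 4.2 and later provide $regexMatch for use inside $expr. The sketch below builds the same $lookup stage in Python under that assumption. The function name lookup_stage and the "as" value "matches" are invented for illustration (the original snippet does not show its "as" field), and some-type is assumed to be a string when present.

```python
def lookup_stage(type_patterns):
    """Build a $lookup stage whose $expr also regex-matches 'some-type'."""
    # One $regexMatch per pattern; $or succeeds if any pattern matches.
    regex_checks = [
        {"$regexMatch": {"input": "$some-type", "regex": pattern}}
        for pattern in type_patterns
    ]
    return {"$lookup": {
        "from": "some-collection",
        "let": {"someIds": "$someIds"},
        "pipeline": [{"$match": {"$expr": {"$and": [
            {"$in": ["$someId", "$$someIds"]},
            {"$not": {"$eq": ["$status", "archived"]}},
            {"$or": regex_checks},
        ]}}}],
        # Hypothetical output field name; not part of the original query.
        "as": "matches",
    }}

stage = lookup_stage(["type1", "type2"])
```

On servers older than 4.2, the separate $match shown above remains the simpler option, since query operators accept regexes directly.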
{
"_id" : ObjectId("5bd6ed6a49ba281f5c54f185"),
"AvatarSet" : {
"Avatar" : [
{
"IsPrimaryAvatar" : true,
"ProfilePictureUrl" : "https://blob.blob.core.windows.net/avatarcontainer/avatardba36759-3e8e-4666-bc2b-e53ffb527716.jpeg?version=8b1b58b3-94f8-4608-b4db-05746eea8bfe"
}
]
}
}
Here I need to replace only the https://blob.blob.core.windows.net prefix with https://blob123.blob.core.windows.net for every candidate in the database. How can I write a MongoDB query for this?
I'm using the following query, but it's not working:
db.getCollection("candidate-staging")
.find({},{"AvatarSet":[0]})..forEach(function(e) {
e.ProfilePictureUrl= e.ProfilePictureUrl.replace("https://blob.blob.core.windows.net", "https://blob123.blob.core.windows.net");
db.candidate-staging.save(e);
});
The problem in your script is that ProfilePictureUrl is not referenced correctly. Using dot notation, as in the example below, should solve the problem.
In your code, e.ProfilePictureUrl points to a missing field in the top-level document, while doc.AvatarSet.Avatar[0].ProfilePictureUrl in the following example points to the ProfilePictureUrl field of the first element in the Avatar array under the AvatarSet field of the main document.
db.test.find({}).forEach(function(doc) {
doc.AvatarSet.Avatar[0].ProfilePictureUrl= doc.AvatarSet.Avatar[0].ProfilePictureUrl.replace("https://blob.blob.core.windows.net", "https://blob123.blob.core.windows.net");
db.test.save(doc);
});
Local test:
mongos> db.test.find()
{ "_id" : ObjectId("5bdb5e3c553c271478a9a006"), "AvatarSet" : { "Avatar" : [ { "IsPrimaryAvatar" : true, "ProfilePictureUrl" : "https://blob.blob.core.windows.net/avatarcontainer/avatardba36759-3e8e-4666-bc2b-e53ffb527716.jpeg?version=8b1b58b3-94f8-4608-b4db-05746eea8bfe" } ] } }
{ "_id" : ObjectId("5bdb5e3e553c271478a9a007"), "AvatarSet" : { "Avatar" : [ { "IsPrimaryAvatar" : true, "ProfilePictureUrl" : "https://blob.blob.core.windows.net/avatarcontainer/avatardba36759-3e8e-4666-bc2b-e53ffb527716.jpeg?version=8b1b58b3-94f8-4608-b4db-05746eea8bfe" } ] } }
mongos> db.test.find({}).forEach(function(doc) {
doc.AvatarSet.Avatar[0].ProfilePictureUrl= doc.AvatarSet.Avatar[0].ProfilePictureUrl.replace("https://blob.blob.core.windows.net", "https://blob123.blob.core.windows.net");
db.test.save(doc); });
mongos> db.test.find()
{ "_id" : ObjectId("5bdb5e3c553c271478a9a006"), "AvatarSet" : { "Avatar" : [ { "IsPrimaryAvatar" : true, "ProfilePictureUrl" : "https://blob123.blob.core.windows.net/avatarcontainer/avatardba36759-3e8e-4666-bc2b-e53ffb527716.jpeg?version=8b1b58b3-94f8-4608-b4db-05746eea8bfe" } ] } }
{ "_id" : ObjectId("5bdb5e3e553c271478a9a007"), "AvatarSet" : { "Avatar" : [ { "IsPrimaryAvatar" : true, "ProfilePictureUrl" : "https://blob123.blob.core.windows.net/avatarcontainer/avatardba36759-3e8e-4666-bc2b-e53ffb527716.jpeg?version=8b1b58b3-94f8-4608-b4db-05746eea8bfe" } ] } }
The document contains an array of objects nested inside an object. In the original code, e.ProfilePictureUrl points to a missing field in the top-level document. Because the objects we need are inside the Avatar array, we need another loop over that array, such as e.AvatarSet.Avatar.forEach. This works for me.
db.getCollection("test").find({}).forEach(function(e,i) {
e.AvatarSet.Avatar.forEach(function(url, j) {
url.ProfilePictureUrl = url.ProfilePictureUrl.replace("https://blob.blob.core.windows.net", "https://blob123.blob.core.windows.net");
e.AvatarSet.Avatar[j] = url;
});
db.getCollection("test").save(e);
printjson(e);
})
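For anyone doing this from Python, the same per-element loop can be sketched with pymongo. This is only a sketch under assumptions: the names replace_host and migrate are made up, and migrate expects a pymongo Collection such as db["candidate-staging"]. Only the Avatar array is written back with $set, so the rest of each document is left untouched.

```python
OLD_HOST = "https://blob.blob.core.windows.net"
NEW_HOST = "https://blob123.blob.core.windows.net"

def replace_host(doc):
    """Return a new Avatar list for doc with the URL host swapped."""
    avatars = (doc.get("AvatarSet") or {}).get("Avatar") or []
    updated = []
    for avatar in avatars:
        url = avatar.get("ProfilePictureUrl")
        if isinstance(url, str):
            # Replace the first occurrence only, like the shell's replace().
            avatar = {**avatar, "ProfilePictureUrl": url.replace(OLD_HOST, NEW_HOST, 1)}
        updated.append(avatar)
    return updated

def migrate(collection):
    """Rewrite the host in every document of a pymongo collection."""
    for doc in collection.find({}):
        collection.update_one(
            {"_id": doc["_id"]},
            {"$set": {"AvatarSet.Avatar": replace_host(doc)}},
        )
```

Keeping the string handling in replace_host makes it possible to check the rewrite on a sample document before running migrate against real data.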
thanks!! manfonton and stackoverflow
I have a document like below:
{
_id: 1,
data: [ { zip: 001, city: "abc" }, { zip: 002, city: "xyz" } ]
}
I want to filter the data array using a Python regex, but it doesn't seem to work.
import json
import re
city = "abc"
regx = re.compile("^%s$" % city, re.IGNORECASE | re.MULTILINE)
for doc in db.testusers.aggregate([ { "$project": { "data": { "$filter": { "input": "$data", "as": "item", "cond": { "$eq": [ "$$item.city", regx ] } } } } } ]):
    print(json.dumps(doc))
It doesn't match anything.
Am I doing it right?
I think $filter does not support regex: $eq in cond compares values for equality and does not evaluate the pattern. See doc.
I cannot test this here, but it should work along the lines of this sample:
city_list = ["cityAbc", "Metroid"]
city_list = [re.compile("^" + str(c_id) + "$", re.IGNORECASE) for c_id in city_list]
pipe = [ { "$match" : { "_id":{"$in" : city_list}}},
{ "$unwind" : "$rp"},
{"$group":{"_id": "$_id", "rp": { "$push": "$rp" }}} , {"$limit":500}]
res = list(db.coll.aggregate(pipeline = pipe,allowDiskUse=True))
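If the server is MongoDB 4.2 or later, $regexMatch can, as far as I know, be used as the cond of $filter, which is closer to what the question asks than matching on _id. A minimal sketch, assuming the data array and city field from the question; filter_by_city is a made-up helper name. The pattern is passed as a string with the "i" option instead of a compiled Python object.

```python
import re

def filter_by_city(city):
    """Build a $project stage keeping only data items whose city matches."""
    return [{"$project": {"data": {"$filter": {
        "input": "$data",
        "as": "item",
        "cond": {"$regexMatch": {
            "input": "$$item.city",
            # re.escape keeps user input from being read as regex syntax.
            "regex": "^" + re.escape(city) + "$",
            # "i" makes the match case-insensitive.
            "options": "i",
        }},
    }}}}]

pipeline = filter_by_city("abc")
# With a live connection: docs = list(db.testusers.aggregate(pipeline))
```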
Working native query
{
$match: {
$and : [
{userType:"200"},
{
$or: [
{login : /infosys/},
{email : /infosys/},
{firstName : /infosys/},
{lastName : /infosys/}
]
}
]
}
}
SpringData API which is not working as expected:
match(
Criteria.where("userType").is(userType).orOperator(
Criteria.where("login").regex(searchTxt).orOperator(
Criteria.where("email").regex(searchTxt).orOperator(
Criteria.where("firstName").regex(searchTxt).orOperator(Criteria.where("lastName").regex(searchTxt))
)
)
)
);
You are nesting each criterion inside another orOperator call. orOperator takes a list of criteria.
Below is the equivalent for your native query.
match
(
Criteria.where("userType").is(userType)
.orOperator(
Criteria.where("login").regex(searchTxt),
Criteria.where("email").regex(searchTxt),
Criteria.where("firstName").regex(searchTxt),
Criteria.where("lastName").regex(searchTxt)
)
)
What I'm specifically looking for are analogs for the LEFT, SUBSTRING and REPLACE functions from SQL in mongo.
After researching this for some time, I can't find any direct analogs for those functions nor am I seasoned enough with mongodb that I can see another way of performing an equivalent operation.
An example of a similar query that I'm looking for is as follows:
REPLACE(case when LEFT(title,1) = '"' then SUBSTRING(title, 2, LEN(title)) else title end,char(9),'')
Which is to be used as part of a $project.
Cheers.
String operations you need:
{ $concat: [ <expression1>, <expression2>, ... ] }
{ $substr: [ <string>, <start>, <length> ] }
Examples:
{ "_id" : 1, "item" : "ABC1", quarter: "13Q1", "description" : "product 1" }
{ "_id" : 2, "item" : "ABC2", quarter: "13Q4", "description" : "product 2" }
{ "_id" : 3, "item" : "XYZ1", quarter: "14Q2", "description" : null }
Concat:
db.inventory.aggregate(
[
{ $project: { itemDescription: { $concat: [ "$item", " - ", "$description" ] } } }
]
)
Substr:
db.inventory.aggregate(
[
{
$project:
{
item: 1,
yearSubstring: { $substr: [ "$quarter", 0, 2 ] },
quarterSubtring: { $substr: [ "$quarter", 2, -1 ] }
}
}
]
)
Look at the string operators in the aggregation documentation. I think you should combine the $cond operator with the string ones.
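Putting that suggestion together for the expression in the question, one possible $project is sketched below in Python. Assumptions: $replaceAll is only available on MongoDB 4.4 or later, title is a string, and clean_title_stage is an invented helper name. $substr with a negative length returns the rest of the string, as in the quarter example above.

```python
def clean_title_stage():
    """Build a $project that strips a leading quote and removes tab characters."""
    # LEFT(title, 1)
    first_char = {"$substr": ["$title", 0, 1]}
    # SUBSTRING(title, 2, LEN(title)): everything after the first character.
    rest = {"$substr": ["$title", 1, -1]}
    # CASE WHEN first char is a double quote THEN rest ELSE title END
    unquoted = {"$cond": [{"$eq": [first_char, '"']}, rest, "$title"]}
    # REPLACE(..., char(9), ''): drop tab characters.
    return {"$project": {
        "title": {"$replaceAll": {"input": unquoted, "find": "\t", "replacement": ""}}
    }}

stage = clean_title_stage()
# With a live connection: db.collection.aggregate([stage])
```

On servers without $replaceAll, the $cond and $substr part still works on its own; only the tab removal would need another approach.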