mongodb - perform subquery - c++

Given this document:
{ "_id":1, "id":1, "list" : [ { "lv" : 1 , "id":1}, {"lv" : 2 , "id":2} ] }
I want to run find({"_id":1},{"id":1, "list.lv":1}), but restrict the "list.lv" projection with an additional condition: "list.id = id". In other words, I only want to retrieve "id" and the "list.lv" of the first element in list, because its "list.id" == "id" == 1.
Normally the condition value is provided in code, but in this example the value is in the document itself. SQL does this with a subquery or a join. Does MongoDB support this in a single query? And how would I write it with the C++ driver?
Following the answer, I added this C++ code:
mongo::BSONObj res;
std::vector<mongo::BSONObj> pipeline;
pipeline.push_back(BSON("$match"<<BSON("_id"<<1)));
pipeline.push_back(BSON("$unwind"<<"$list"));
mongo::BSONArrayBuilder ab;
ab<<"$id"<<"$list.id";
pipeline.push_back(BSON("$project"<<BSON("id"<<1<<"list.lv"<<1<<"equalsFlag"<<BSON("$subtract"<<ab.arr()))));
pipeline.push_back(BSON("$match"<<BSON("equalsFlag"<<0)));
pipeline.push_back(BSON("$project"<<BSON("id"<<1<<"list.lv"<<1)));
conn->runCommand("db_name", BSON( "aggregate" << "collection_name" << "pipeline" << pipeline ), res);
std::cout<<res["result"].Array()[0].Obj().getObjectField("list").getIntField("lv");

If I understood your question, try this native aggregation framework query to accomplish what you need:
db.collectionName.aggregate(
{"$match" : {"_id" : 1 }},
{"$unwind" : "$list"},
{"$project" : {"id":1, "list.lv" : 1, "equalsFlag" : {"$subtract" : ["$id", "$list.id"]}}},
{"$match" : {"equalsFlag" : 0}},
{"$project" : {"id": 1, "list.lv" : 1}})
Let me explain it in more detail. It's important to filter out as many documents as we can first, which is what the first $match does. Note that if we applied the {"_id" : 1 } filter at the end of the pipeline, MongoDB would not be able to use an index for it. $unwind then turns each element of the list array into a separate document.
Next we need to compare two fields. I'm not aware of any easy way to do that except for $where, but we can't use it with the aggregation framework. Fortunately both id and list.id are numeric, so we can $subtract one from the other to see if they are equal: "equalsFlag" : {"$subtract" : ["$id", "$list.id"]}. If they are, equalsFlag will be 0. So we just add a new $match to keep the documents where id = list.id, and finally one more $project to omit the equalsFlag field from the results.
I'm not a C++ guy, but I'm sure the C++ driver supports the aggregation framework as most other drivers do. So just google some examples to convert this native query into a C++ one. This should be fairly easy; at least it is for C#.
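The stages described above can be mimicked in plain Python to see what each one does to the sample document. This is only an illustrative simulation, not MongoDB driver code; the helper names unwind and run_pipeline are made up for the sketch.

```python
# Plain-Python simulation of the pipeline stages (not MongoDB driver code).
# The helper names unwind() and run_pipeline() are hypothetical.

docs = [
    {"_id": 1, "id": 1, "list": [{"lv": 1, "id": 1}, {"lv": 2, "id": 2}]},
]


def unwind(doc, field):
    """Mimic $unwind: emit one copy of the document per array element."""
    for element in doc[field]:
        copy = dict(doc)
        copy[field] = element
        yield copy


def run_pipeline(collection, wanted_id):
    results = []
    for doc in collection:
        # First $match on _id, so later stages see as few documents as possible.
        if doc["_id"] != wanted_id:
            continue
        for row in unwind(doc, "list"):
            # $project with $subtract: 0 means id == list.id.
            equals_flag = row["id"] - row["list"]["id"]
            # Second $match: keep only rows where the flag is 0.
            if equals_flag != 0:
                continue
            # Final $project: drop equalsFlag, keep _id, id and list.lv.
            results.append(
                {"_id": row["_id"], "id": row["id"], "list": {"lv": row["list"]["lv"]}}
            )
    return results


result = run_pipeline(docs, 1)
print(result)
```

Only the first list element survives, because it is the only one whose id equals the top-level id.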
EDIT: C++ code from jean to complete the answer
mongo::BSONObj res;
std::vector<mongo::BSONObj> pipeline;
pipeline.push_back(BSON("$match"<<BSON("_id"<<1)));
pipeline.push_back(BSON("$unwind"<<"$list"));
mongo::BSONArrayBuilder ab;
ab<<"$id"<<"$list.id";
pipeline.push_back(BSON("$project"<<BSON("id"<<1<<"list.lv"<<1<<"equalsFlag"<<BSON("$subtract"<<ab.arr()))));
pipeline.push_back(BSON("$match"<<BSON("equalsFlag"<<0)));
pipeline.push_back(BSON("$project"<<BSON("id"<<1<<"list.lv"<<1)));
conn->runCommand("db_name", BSON( "aggregate" << "collection_name" << "pipeline" << pipeline ), res);
std::cout<<res["result"].Array()[0].Obj().getObjectField("list").getIntField("lv");
Hope it helps!

Related

MongoDB query with special characters in key

In my case, I have keys in my MongoDB database that contain a dot in their name (see attached screenshot). I have read that it is possible to store data in MongoDB this way, but the driver prevents queries with dots in the key. Anyway, in my MongoDB database, keys do contain dots and I have to work with them.
I have now tried to encode the dots in the query (. to \u002e) but it did not seem to work. Then I had the idea to work with regex to replace the dots in the query with any character but regex seems to only work for the value and not for the key.
Does anyone have a creative idea how I can get around this problem? For example, I want to have all the CVE numbers for 'cve_results.BusyBox 1.12.1'.
Update #1:
The structure of cve_results is as follows:
"cve_results" : {
"BusyBox 1.12.1" : {
"CVE-2018-1000500" : {
"score2" : "6.8",
"score3" : "8.1",
"cpe_version" : "N/A"
},
"CVE-2018-1000517" : {
"score2" : "7.5",
"score3" : "9.8",
"cpe_version" : "N/A"
}
}}
With the following workaround I was able to directly access documents by their keys, even though they have a dot in their key:
db.getCollection('mycollection').aggregate([
{$match: {mymapfield: {$type: "object" }}}, //filter objects with right field type
{$project: {mymapfield: { $objectToArray: "$mymapfield" }}}, //"unwind" map to array of {k: key, v: value} objects
{$match: {mymapfield: {k: "my.key.with.dot", v: "myvalue"}}} //query
])
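To make the idea behind this workaround concrete, here is a plain-Python sketch of what $objectToArray does to the cve_results structure from the question. It is not driver code, and object_to_array is a hypothetical helper written only for illustration.

```python
# Plain-Python sketch of the $objectToArray idea (not MongoDB driver code).
# object_to_array() is a hypothetical helper.

document = {
    "cve_results": {
        "BusyBox 1.12.1": {
            "CVE-2018-1000500": {"score2": "6.8", "score3": "8.1", "cpe_version": "N/A"},
            "CVE-2018-1000517": {"score2": "7.5", "score3": "9.8", "cpe_version": "N/A"},
        }
    }
}


def object_to_array(obj):
    """Mimic $objectToArray: {"a": 1} becomes [{"k": "a", "v": 1}]."""
    return [{"k": key, "v": value} for key, value in obj.items()]


# The dotted name is now a plain value under "k", so it can be compared directly
# instead of being interpreted as a path.
entries = object_to_array(document["cve_results"])
cve_numbers = [
    cve
    for entry in entries
    if entry["k"] == "BusyBox 1.12.1"
    for cve in entry["v"]
]
print(cve_numbers)
```

Because the key with the dot ends up as data rather than as a field path, the dot no longer has any special meaning in the comparison.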
If possible, it could be worth inserting documents using \u002e instead of the dot; that way you can query them while retaining the ASCII value of the . for any client rendering.
However, it appears there's a workaround to query them like so:
db.collection.aggregate({
$match: {
"BusyBox 1.12.1" : "<value>"
}
})
You should be able to use $eq operator to query fields with dots in names.

Making Dedupe learn from existing label data

I am aware that Dedupe uses active learning to remove duplicates and perform record linkage.
However, I would like to know if we can pass an Excel sheet with already matched pairs (labeled data) as the input for active learning?
Not directly.
You'll need to get your data into a format that markPairs can consume.
Something like:
labeled_examples = {'match' : [],
'distinct' : [({'name' : 'Georgie Porgie'},
{'name' : 'Georgette Porgette'})]
}
deduper.markPairs(labeled_examples)
We do provide a convenience function for getting spreadsheet data into this format: trainingDataDedupe.
(I am an author of dedupe)
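As a rough sketch of the conversion step, the snippet below builds a dictionary of that shape from spreadsheet rows using only the standard library. It assumes the Excel sheet has been exported to CSV, and the column names (name_a, name_b, label) and the second data row are invented for the example; it does not call dedupe itself.

```python
import csv
import io

# Hypothetical spreadsheet export: one labeled pair per row.
# The column names (name_a, name_b, label) are assumptions for this sketch,
# and the "Jack Sprat" row is made-up sample data.
sheet = io.StringIO(
    "name_a,name_b,label\n"
    "Georgie Porgie,Georgette Porgette,distinct\n"
    "Jack Sprat,Jack Spratt,match\n"
)

labeled_examples = {"match": [], "distinct": []}
for row in csv.DictReader(sheet):
    # Each pair is a tuple of two record dictionaries.
    pair = ({"name": row["name_a"]}, {"name": row["name_b"]})
    labeled_examples[row["label"]].append(pair)

print(labeled_examples)
```

The resulting dictionary has the same structure as the labeled_examples shown in the answer, so it could then be handed to deduper.markPairs(labeled_examples).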

How to match nested values in RethinkDB?

I use Python client driver and the structure of my documents is :
{"key1": ["value1"], "key2": ["value2"], ..., "key7": ["value7"]}
Let's say "value7" is "In every time in every place, deeds of men remain the same".
I'd like to retrieve all documents that contain "deed" for key7.
I tried
r.db('db')
.table('table')
.filter(lambda row: row['key7'].match('^deed'))
.run(conn)
but it doesn't work. I get the following message:
rethinkdb.errors.ReqlQueryLogicError: Expected type STRING but found ARRAY
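Here is the solution. key7 holds an array, so take its first element with nth(0) before calling match. The pattern is 'deed' rather than '^deed', because '^' would only match strings that start with "deed":
(
r.db('db')
.table('table')
.filter(lambda row: row['key7'].nth(0).match('deed'))
.run(conn)
)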
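The same type problem can be reproduced with Python's own re module, which may make the error easier to see. This is only an analogy in plain Python, not ReQL: indexing with [0] stands in for nth(0), and re.search stands in for match.

```python
import re

# Plain-Python analogy (not ReQL): the value under key7 is a list of strings.
row = {"key7": ["In every time in every place, deeds of men remain the same"]}

# Matching against the list itself fails, much like
# "Expected type STRING but found ARRAY" in RethinkDB.
try:
    re.search("deed", row["key7"])
    matched_list = True
except TypeError:
    matched_list = False

# Taking the first element first, as nth(0) does, gives a string to match on.
first = row["key7"][0]
contains_deed = re.search("deed", first) is not None
# An anchored pattern only matches when the string starts with "deed".
starts_with_deed = re.search("^deed", first) is not None

print(matched_list, contains_deed, starts_with_deed)
```

Note that an anchored pattern such as '^deed' does not match the sample sentence, since "deeds" appears in the middle of the string.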

Where condition in geode

I am new to Geode.
I am adding an entry like below:
gfsh>put --key=('id':'11') --value=('firstname':'Amaresh','lastname':'Dhal') --region=region
Result : true
Key Class : java.lang.String
Key : ('id':'11')
Value Class : java.lang.String
Old Value : <NULL>
When I query like this:
gfsh>query --query="select * from /region"
Result : true
startCount : 0
endCount : 20
Rows : 9
Result
-----------------------------------------
('firstname':'A2','lastname':'D2')
HI
Amaresh
Amaresh
('firstname':'A1','lastname':'D1')
World
World
('firstname':'Amaresh','lastname':'Dhal')
Hello
NEXT_STEP_NAME : END
When I try to query like below, I don't get the value:
gfsh>query --query="select * from /region r where r.id='11'"
Result : true
startCount : 0
endCount : 20
Rows : 0
NEXT_STEP_NAME : END
Of course I can use the get command, but I want to use a where condition. What am I doing wrong? It gives no output.
Thanks
In Geode the key is not "just another column". In fact, the basic query syntax implicitly queries only the fields of the value. However, you can include the key in your query using this syntax:
select value.lastname, value.firstname from /region.entries where key.id=11
Also, it is fairly common practice to include the id field in your value class even though it is not strictly required.
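One way to picture the difference is to treat the region as a map. The sketch below is a plain-Python analogy, not Geode or OQL code; the Key type and the sample entries are made up for illustration.

```python
from collections import namedtuple

# Plain-Python analogy (not Geode or OQL): a region behaves like a map.
# Key and the entries below are hypothetical sample data.
Key = namedtuple("Key", ["id"])

region = {
    Key(id=11): {"firstname": "Amaresh", "lastname": "Dhal"},
    Key(id=12): {"firstname": "A1", "lastname": "D1"},
}

# Like "select * from /region r where r.id = 11": only values are visited,
# and the values carry no id field, so nothing matches.
values_only = [value for value in region.values() if value.get("id") == 11]

# Like "select value.lastname, value.firstname from /region.entries where key.id = 11":
# entries expose both the key and the value.
from_entries = [
    (value["lastname"], value["firstname"])
    for key, value in region.items()
    if key.id == 11
]

print(values_only, from_entries)
```

The first lookup comes back empty for the same reason the original query returned 0 rows: the id lives in the key, not in the value.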
What Randy said is exactly right: the "key" is not another column. The exact format of the query should be
gfsh>query --query="select * from /Address.entries where key=2"
What you are looking for here is getting all the "entries" on the region "Address" and then querying the key.
To check which one you want to query you can fire this query
gfsh>query --query="select * from /Address.entries"
You can always use the get command to fetch the data pertaining to a specific key.
get --key=<KEY_NAME> --region=<REGION_NAME>
Example:
get --key=1 --region=Address
Reference: https://gemfire.docs.pivotal.io/910/geode/tools_modules/gfsh/command-pages/get.html

How to query MongoDB for matching documents where item is in document array

If my stored document looks like this:
doc = {
'Things' : [ 'one' , 'two' , 'three' ]
}
How can I query for documents which contain 'one' in Things?
I know the $in operator queries a document item against a list, but this is kind of the reverse. Any help would be awesome.
Use MongoDB's multikeys support:
MongoDB provides an interesting "multikey" feature that can automatically index arrays of an object's values.
[...]
db.articles.find( { tags: 'april' } )
{"name" : "Warm Weather" , "author" : "Steve" ,
"tags" : ["weather","hot","record","april"] ,
"_id" : "497ce4051ca9ca6d3efca323"}
Basically, you don't have to worry about the array-ness of Things, MongoDB will take care of that for you; something like this in the MongoDB shell would work:
db.your_collection.find({ Things: 'one' })
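The matching rule can be sketched in plain Python as follows. This is a simplified model of how an equality filter treats array fields, not driver code, and field_matches and find are hypothetical helpers.

```python
# Plain-Python sketch of how an equality filter treats array fields
# (not MongoDB driver code; field_matches() and find() are hypothetical helpers).


def field_matches(stored, wanted):
    """A field matches if it equals the value or is an array containing it."""
    if isinstance(stored, list):
        return stored == wanted or wanted in stored
    return stored == wanted


def find(collection, query):
    """Return the documents where every queried field matches."""
    return [
        doc
        for doc in collection
        if all(field_matches(doc.get(field), value) for field, value in query.items())
    ]


collection = [
    {"Things": ["one", "two", "three"]},
    {"Things": ["four"]},
    {"Things": "one"},
]

found = find(collection, {"Things": "one"})
print(found)
```

Both the document with 'one' inside the array and the document with the scalar value 'one' are returned, which is why no special operator is needed for this kind of query.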