Query nested BSON documents with mongo c++ driver - c++

I have a bsoncxx::document::view bsonObjView and a std::vector<std::string> path that represents the keys leading to the value we are searching for in the BSON document (the first key is at the top level, the second key at depth 1, the third key at depth 2, etc.).
I'm trying to write a function that given a path will search the bson document:
bsoncxx::document::element deepFieldAccess(bsoncxx::document::view bsonObj, const std::vector<std::string>& path) {
assert (!path.empty());
// for each key, find the corresponding value at the current depth, next keys will search the value (document) we found at the current depth
for (auto const& currKey: path) {
// get value for currKey
bsonObj = bsonObj.find(currKey);
}
// for every key in the path we found a value at the appropriate level, we return the final value we found with the final key
return bsonObj;
}
How can I make the function work? What type should bsonObj be to allow for such searches within a loop? Also, how can I check whether a value for currKey has been found?
Also, is there some bsoncxx built in way to do this?
Here is an example json document followed by some paths that point to values inside of it. The final solution should return the corresponding value when given the path:
{
"shopper": {
"Id": "4973860941232342",
"Context": {
"CollapseOrderItems": false,
"IsTest": false
}
},
"SelfIdentifiersData": {
"SelfIdentifierData": [
{
"SelfIdentifierType": {
"SelfIdentifierType": "111"
}
},
{
"SelfIdentifierType": {
"SelfIdentifierType": "2222"
}
}
]
}
}
Example paths:
The path [ shopper -> Id -> targetValue ] points to the string "4973860941232342".
The path [ SelfIdentifiersData -> SelfIdentifierData -> array_idx: 0 -> targetValue ] points to the object { "SelfIdentifierType": { "SelfIdentifierType": "111" } }.
The path [ SelfIdentifiersData -> SelfIdentifierData -> array_idx: 0 -> SelfIdentifierType -> targetValue ] points to the object { "SelfIdentifierType": "111" }.
The path [ SelfIdentifiersData -> SelfIdentifierData -> array_idx: 0 -> SelfIdentifierType -> SelfIdentifierType -> targetValue ] points to the string "111".
Note that the paths are of the type std::vector<std::string> path. So the final solution should return the value that the path points to. It should work for arbitrary depths, and also for paths that point TO array elements (second example path) and THROUGH array elements (last 2 example paths). We assume that the key for an array element at index i is "i".
Update: Currently, the approach suggested by @acm fails for paths with array indices (paths without array indices work fine). Here is all the code to reproduce the issue:
#include <iostream>
#include <bsoncxx/json.hpp>
#include <mongocxx/client.hpp>
#include <mongocxx/instance.hpp>
std::string turnQueryResultIntoString3(bsoncxx::document::element queryResult) {
// check if no result for this query was found
if (!queryResult) {
return "[NO QUERY RESULT]";
}
// hax
bsoncxx::builder::basic::document basic_builder{};
basic_builder.append(bsoncxx::builder::basic::kvp("Your Query Result is the following value ", queryResult.get_value()));
std::string rawResult = bsoncxx::to_json(basic_builder.view());
std::string frontPartRemoved = rawResult.substr(rawResult.find(":") + 2);
std::string backPartRemoved = frontPartRemoved.substr(0, frontPartRemoved.size() - 2);
return backPartRemoved;
}
// TODO this currently fails for paths with array indices
bsoncxx::document::element deepFieldAccess3(bsoncxx::document::view bsonObj, const std::vector<std::string>& path) {
if (path.empty())
return {};
auto keysIter = path.begin();
const auto keysEnd = path.end();
std::string currKey = *keysIter; // for debug purposes
std::cout << "Current key: " << currKey;
auto currElement = bsonObj[*(keysIter++)];
std::string currElementAsString = turnQueryResultIntoString3(currElement); // for debug purposes
std::cout << " Query result for this key: " << currElementAsString << std::endl;
while (currElement && (keysIter != keysEnd)) {
currKey = *keysIter;
std::cout << "Current key: " << currKey;
currElement = currElement[*(keysIter++)];
currElementAsString = turnQueryResultIntoString3(currElement);
std::cout << " Query result for this key: " << currElementAsString << std::endl;
}
return currElement;
}
// execute this function to see that queries with array indices fail
void reproduceIssue() {
std::string testJson = "{\n"
" \"shopper\": {\n"
" \"Id\": \"4973860941232342\",\n"
" \"Context\": {\n"
" \"CollapseOrderItems\": false,\n"
" \"IsTest\": false\n"
" }\n"
" },\n"
" \"SelfIdentifiersData\": {\n"
" \"SelfIdentifierData\": [\n"
" {\n"
" \"SelfIdentifierType\": {\n"
" \"SelfIdentifierType\": \"111\"\n"
" }\n"
" },\n"
" {\n"
" \"SelfIdentifierType\": {\n"
" \"SelfIdentifierType\": \"2222\"\n"
" }\n"
" }\n"
" ]\n"
" }\n"
"}";
// create bson object
bsoncxx::document::value bsonObj = bsoncxx::from_json(testJson);
bsoncxx::document::view bsonObjView = bsonObj.view();
// example query which contains an array index, this fails. Expected query result is "111"
std::vector<std::string> currQuery = {"SelfIdentifiersData", "SelfIdentifierData", "0", "SelfIdentifierType", "SelfIdentifierType"};
// an example query without array indices, this works. Expected query result is "false"
//std::vector<std::string> currQuery = {"shopper", "Context", "CollapseOrderItems"};
bsoncxx::document::element queryResult = deepFieldAccess3(bsonObjView, currQuery);
std::cout << "\n\nGiven query and its result: [ ";
for (auto i: currQuery)
std::cout << i << ' ';
std::cout << "] -> " << turnQueryResultIntoString3(queryResult) << std::endl;
}

There is not a built-in way to do this, so you will need to write a helper function like the one you outline above.
I believe the issue you are encountering is that the argument to the function is a bsoncxx::document::view, but the return value of view::find is a bsoncxx::document::element. So you need to account for the change of type somewhere in the loop.
I think I would write the function this way:
bsoncxx::document::element deepFieldAccess(bsoncxx::document::view bsonObj, const std::vector<std::string>& path) {
if (path.empty())
return {};
auto keysIter = path.begin();
const auto keysEnd = path.end();
auto currElement = bsonObj[*(keysIter++)];
while (currElement && (keysIter != keysEnd))
currElement = currElement[*(keysIter++)];
return currElement;
}
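Note that this will return an invalid bsoncxx::document::element if any part of the path is not found, or if the path attempts to traverse into a value that is not actually a BSON document. As the update to the question shows, a string key such as "0" does not index into a BSON array, so paths that go through array elements also end in an invalid element with this version.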
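The update in the question reports that this loop fails for paths containing array indices. One possible direction, sketched here under assumptions, is to recognise keys that are plain decimal numbers and treat them as indices. The helper name parseArrayIndex is hypothetical and not part of bsoncxx; the block uses only the standard library so that it stands alone.

```cpp
#include <charconv>
#include <cstdint>
#include <optional>
#include <string>
#include <system_error>

// Hypothetical helper (not part of bsoncxx): interprets a path key such as
// "0" or "17" as an array index. Returns std::nullopt for anything that is
// not a plain non-negative decimal number fitting into 32 bits.
std::optional<std::uint32_t> parseArrayIndex(const std::string& key) {
    if (key.empty())
        return std::nullopt;
    std::uint32_t index = 0;
    const char* first = key.data();
    const char* last = key.data() + key.size();
    const auto result = std::from_chars(first, last, index);
    // Reject parse errors, overflow, and trailing characters such as "1a".
    if (result.ec != std::errc{} || result.ptr != last)
        return std::nullopt;
    return index;
}
```

Inside the while loop, one could then check whether currElement is an array and whether parseArrayIndex(*keysIter) yields a value, and in that case index with the number instead of the string. To my knowledge bsoncxx::document::element has an operator[] overload taking a std::uint32_t for exactly this purpose, but treat that as an assumption to verify against the driver version in use.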

Related

How to modify node element with boost::property_tree and put_value(const Type &value)

I am fairly new to boost::property_tree and I am having a little trouble with what should be a simple task.
I have a default xml file which is to be copied and made unique with parameters passed in via the ptree & modelModifier(...) function below. All I want to do is parse the xml into a ptree, and then modify the name field (amongst others, but let's just start with name) from "default" to whatever name is passed in from the object_name variable, then write that back into the original ptree.
The function pTreeIterator just iterates through each child and displays its contents.
xml
<?xml version='1.0'?>
<sdf version='1.7'>
<model name='default'>
<link name='link'>
<inertial>
<mass>3.14999</mass>
<inertia>
<ixx>2.86712</ixx>
<ixy>0</ixy>
<ixz>0</ixz>
<iyy>2.86712</iyy>
<iyz>0</iyz>
<izz>0.524998</izz>
</inertia>
<pose>0 0 0 0 -0 0</pose>
</inertial>
</link>
</model>
</sdf>
Code
void ptreeIterator(ptree & pt)
{
using boost::property_tree::ptree;
for (auto&it : pt)
{
std::cout << __FUNCTION__ << std::endl;
std::cout << "------------------------------------------------------" << std::endl;
std::cout << "Iteration output: " << std::endl;
std::cout << "1: pt1: " << it.first << std::endl;
if(pt.get_child_optional(it.first))
{
cout << "pt.get_child_optional(it.first) ---> " << it.first << endl;
ptree pt2 = pt.get_child(it.first);
for (auto&it2 : pt2)
{
std::cout << "\t2: " << "pt2: " << it2.first << " :: " << (std::string)it2.second.data() << std::endl;
if(pt2.get_child_optional(it2.first))
{
ptree pt3 = pt2.get_child(it2.first);
for (auto&it3 : pt3)
{
std::cout << "\t\t3: " << "pt3: " << it3.first << " :: " << (std::string)it3.second.data() << std::endl;
}
}
}
}
}
}
ptree & modelModifier(ptree &model, double px, double py, std::string object_name, uint16_t height)
{
for(auto &it:model){
cout << "it.first = " << it.first << endl;
if(it.first=="model"){
cout << "MODEL TAG" << endl;
model.put_value(object_name);
}
ptreeIterator(model);
}
}
int main(){
ptree ptModel;
const string filenameToInsert = "model.sdf";
std::ifstream ifsModel(filenameToInsert,std::ifstream::binary);
read_xml(ifsModel, ptModel, boost::property_tree::xml_parser::trim_whitespace);
modelModifier(ptModel, 0, 0, "TEST1234567", 30);
return 0;
}
Output
it.first = model
it.second.data()
ptreeIterator
------------------------------------------------------
Iteration output:
1: pt1: model
pt.get_child_optional(it.first) ---> model
2: pt2: <xmlattr> ::
3: pt3: name :: default
Expected Output
it.first = model
it.second.data()
ptreeIterator
------------------------------------------------------
Iteration output:
1: pt1: model
pt.get_child_optional(it.first) ---> model
2: pt2: <xmlattr> ::
3: pt3: name :: TEST1234567
Firstly, your code has UB, since modelModifier doesn't return a value.
The C-style cast in (std::string)it2.second.data() is extremely dangerous as it risks reinterpret_cast-ing unrelated types. There is no reason whatsoever for this kind of blunt casting. Just remove the cast!
Also, ptreeIterator should probably take a ptree const&, not ptree&.
With these fixed, the sample does NOT show the output you claim, instead it prints (Live On Coliru)
it.first = sdf
ptreeIterator
------------------------------------------------------
Iteration output:
1: pt1: sdf
pt.get_child_optional(it.first) ---> sdf
2: pt2: <xmlattr> ::
3: pt3: version :: 1.7
2: pt2: model ::
3: pt3: <xmlattr> ::
3: pt3: link ::
Now even in your question output, you clearly see the difference between the model node and its name attribute, which apparently you want to modify. Just write the code to access that:
it.second.get_child("<xmlattr>.name").put_value(object_name);
This would be correct, assuming that the attribute always exists (and that, instead of ptModel, you pass ptModel.get_child("sdf") to modelModifier).
Other Notes: SIMPLIFY!
That said, please simplify the whole thing!
ptree pt2 = pt.get_child(it.first);
Should have been something like
ptree const& pt2 = it.second;
And
the use of get_child_optional only to repeat with get_child is even more wasteful
Good practice is to separate output/query and mutation. So don't call ptreeIterator from inside modelModifier
Also, give functions a good descriptive name (so that you don't have to sheepishly explain "The function pTreeIterator just iterates through each child and displays its contents" - just call it displayModel?)
Instead of painstakingly (and incorrectly) iterating the particular model and printing it in a pretty confusing bespoke manner, just use write_xml/write_info/write_json to dump it in a reliable manner.
Listing
Live On Coliru
namespace Model {
void display(ptree const& pt)
{
write_json(std::cout << __FUNCTION__ << "\n---------------\n", pt);
}
void modify(ptree& model, double, double, std::string object_name, uint16_t)
{
for (auto& it : model) {
std::cout << "root = " << it.first << std::endl;
it.second.get_child("model.<xmlattr>.name").put_value(object_name);
}
}
}
Prints
root = sdf
display
---------------
{
"sdf": {
"<xmlattr>": {
"version": "1.7"
},
"model": {
"<xmlattr>": {
"name": "TEST1234567"
},
"link": {
"<xmlattr>": {
"name": "link"
},
"inertial": {
"mass": "3.14999",
"inertia": {
"ixx": "2.86712",
"ixy": "0",
"ixz": "0",
"iyy": "2.86712",
"iyz": "0",
"izz": "0.524998"
},
"pose": "0 0 0 0 -0 0"
}
}
}
}
}
Bonus
In the case that the name attribute might not already exist, the following code would create it on the fly:
void modify(ptree& model, double, double, std::string object_name, uint16_t)
{
ptree value;
value.put_value(object_name);
for (auto& it : model) {
std::cout << "root = " << it.first << std::endl;
it.second.put_child("model.<xmlattr>.name", value);
}
}

Convert rapidjson array iterator to rapidjson::value

How do I convert a rapidjson array iterator to a rapidjson::value?
I do not want answers that focus on how to get the contents of a rapid json array, or how to iterate through it.
I am also very aware that I can access members through the iterator using the example in the rapidjson documentation itr->name, but that is not what I want either. That form of working with rapidjson arrays already appears on many stack overflow questions and the rapidjson docs, and has been covered.
I need to end up with a rapidjson::value when starting with a rapidjson array iterator.
If we have a std::vector<int> then we can assign
std::vector<int>::iterator itMyInt = myvector.begin();
const int & myInt = *itMyInt;
I would expect to be able to do the same thing with a rapidjson array iterator, but my compiler disagrees.
The reason I need a rapidjson::value is that I'd like to reuse the same parsing method to parse the json object when it is an element of an array, as I would when parsing that object on its own and not in an array.
Let me demonstrate with my minimal example:
// Rapid JSON Includes
#include <rapidjson/document.h>
#include <rapidjson/stringbuffer.h>
#include <rapidjson/writer.h>
// Standard Includes
#include <exception>
#include <string>
#include <sstream>
//--------------------------------------------------------------------------------------------------
// NOTE -This stub cannot change
void ParseCar(const rapidjson::Value & carJson)
{
}
//--------------------------------------------------------------------------------------------------
int main()
{
std::string json =
"{"
" \"cars\" : ["
" {"
" \"name\" : \"Fiat\","
" \"price\" : 19.95"
" },"
" {"
" \"name\" : \"FRS\","
" \"price\" : 19995.00"
" }]"
"}";
// Parse the JSON
rapidjson::Document document;
document.Parse(json.c_str());
if (document.HasParseError())
{
// Error - Failed to parse JSON
std::ostringstream msg;
msg << "There was an error parsing the JSON"
<< " Error Code: " << document.GetParseError()
<< " Error Offset: " << document.GetErrorOffset();
throw std::exception(msg.str().c_str());
}
// Cars array
if (!document.HasMember("cars") ||
!document["cars"].IsArray())
{
std::string msg("Expected \"cars\" JSON array");
throw std::exception(msg.c_str());
}
const rapidjson::Value & carsArrayJSON = document["cars"];
/* Doesn't compile - No GetArray method exists
for (auto & carJSON : carsArrayJSON.GetArray())
{
}
*/
for (rapidjson::Value::ConstMemberIterator itCarJSON = carsArrayJSON.MemberBegin(); itCarJSON != carsArrayJSON.MemberEnd(); ++itCarJSON)
{
// Error - const rapidjson::GenericMember<Encoding,Allocator>' to 'const rapidjson::Value
const rapidjson::Value & carJSON = *itCarJSON;
ParseCar(carJSON);
}
return 0;
}
MemberIterator iterates Members of an Object. ValueIterator iterates Values in an Array.
Also, I'm not sure why that loop had issues for you, this compiles for me:
std::string json =
"{"
" \"cars\" : ["
" {"
" \"name\" : \"Fiat\","
" \"price\" : 19.95"
" },"
" {"
" \"name\" : \"FRS\","
" \"price\" : 19995.00"
" }]"
"}";
Document doc;
doc.Parse(json.data());
Value const& cars = doc["cars"];
for (auto& car : cars.GetArray()) {
cout << "name: " << car["name"].GetString() << " "
<< "price: " << car["price"].GetFloat() << endl;
}
And returns:
name: Fiat price: 19.95
name: FRS price: 19995
As for applying your function in a loop:
for (Value const& car : cars.GetArray()) {
ParseCar(car);
}
If you must use an iterator:
for (Value::ConstValueIterator car = cars.Begin(); car != cars.End(); ++car) {
ParseCar(*car);
}
Or with auto:
for (auto car = cars.Begin(); car != cars.End(); ++car) {
ParseCar(*car);
}
Note: this will only work on an array.
Recently, I faced a similar problem and came up with a different approach.
First, I set up a pointer to the desired path like so:
int myIterator = 0;
string myString = "/Layer1/Layer2/" + to_string(myIterator);
Pointer p(myString.c_str());
I now have a pointer to a variable path in my JSON object that varies with myIterator.
Then, I just used a for loop in which I use the SetValueByPointer function to append the pointer's value to a JSON object.
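The path strings used above follow the JSON Pointer syntax (RFC 6901). A minimal sketch of building such strings from individual tokens, including the escaping that the syntax requires, could look like this. The helper names makeJsonPointer and makeIndexedPointers are hypothetical and not part of rapidjson; the block uses only the standard library.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: joins reference tokens into a JSON Pointer string.
// Inside each token, '~' is escaped as "~0" and '/' as "~1" (RFC 6901).
std::string makeJsonPointer(const std::vector<std::string>& tokens) {
    std::string pointer;
    for (const std::string& token : tokens) {
        pointer += '/';
        for (char c : token) {
            if (c == '~')
                pointer += "~0";
            else if (c == '/')
                pointer += "~1";
            else
                pointer += c;
        }
    }
    return pointer;
}

// Builds one pointer string per array index, mirroring the
// "/Layer1/Layer2/" + to_string(myIterator) pattern from the answer.
std::vector<std::string> makeIndexedPointers(const std::vector<std::string>& prefix,
                                             std::size_t count) {
    std::vector<std::string> pointers;
    pointers.reserve(count);
    for (std::size_t i = 0; i < count; ++i) {
        std::vector<std::string> tokens = prefix;
        tokens.push_back(std::to_string(i));
        pointers.push_back(makeJsonPointer(tokens));
    }
    return pointers;
}
```

Each resulting string can then be passed to the Pointer constructor in the same way as myString.c_str() above.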

Parse nested JSON with QJsonDocument in Qt

I am interested in seeing how we can use Qt's QJsonDocument to parse all entries from a simple nested JSON (as I have just started studying this).
nested json example:
{
"city": "London",
"time": "16:42",
"unit_data":
[
{
"unit_data_id": "ABC123",
"unit_data_number": "21"
},
{
"unit_data_id": "DEF456",
"unit_data_number": "12"
}
]
}
I can parse the non-nested parts of it like so:
QJsonObject jObj;
QString city = jObj["city"].toString();
QString time = jObj["time"].toString();
I am not sure what you are asking, but perhaps this might help:
QJsonDocument doc;
doc = QJsonDocument::fromJson("{ "
" \"city\": \"London\", "
" \"time\": \"16:42\", "
" \"unit_data\": "
" [ "
" { "
" \"unit_data_id\": \"ABC123\", "
" \"unit_data_number\": \"21\" "
" }, "
" { "
" \"unit_data_id\": \"DEF456\", "
" \"unit_data_number\": \"12\" "
" } "
" ] "
" }");
// This part you have covered
QJsonObject jObj = doc.object();
qDebug() << "city" << jObj["city"].toString();
qDebug() << "time" << jObj["time"].toString();
// Since unit_data is an array, you need to get it as such
QJsonArray array = jObj["unit_data"].toArray();
// Then you can manually access the elements in the array
QJsonObject ar1 = array.at(0).toObject();
qDebug() << "" << ar1["unit_data_id"].toString();
// Or you can loop over the items in the array
int idx = 0;
for(const QJsonValue& val: array) {
QJsonObject loopObj = val.toObject();
qDebug() << "[" << idx << "] unit_data_id : " << loopObj["unit_data_id"].toString();
qDebug() << "[" << idx << "] unit_data_number: " << loopObj["unit_data_number"].toString();
++idx;
}
The output I get is:
city "London"
time "16:42"
"ABC123"
[ 0 ] unit_data_id : "ABC123"
[ 0 ] unit_data_number: "21"
[ 1 ] unit_data_id : "DEF456"
[ 1 ] unit_data_number: "12"
In JSON notation, objects are made up of key-value pairs. Keys are always strings, but values can be string literals ("example"), number literals, booleans, null, arrays ([]) or objects ({}).
QJsonDocument::fromJson(...).object() returns the root object of a given JSON string. Recall that objects are written with the {} notation. This method gives you a QJsonObject. This JSON object has 3 keys ("city", "time" and "unit_data"), whose values are a string literal, a string literal and an array, respectively.
So if you want to access the data stored in that array you should do:
QJsonArray array = rootObj["unit_data"].toArray();
Note that arrays don't have keys; they only have values, which can be of any of the types mentioned above. In this case, the array holds 2 objects, which can be treated like any other JSON object. So,
QJsonObject obj = array.at(0).toObject();
Now the obj object points to the following object:
{
"unit_data_id": "ABC123",
"unit_data_number": "21"
}
So, you should now be able to do what you want. :)
It can happen that one of the elements inside your JSON has more elements inside. It can also happen that you don't know the characteristics of the file (or you want to have a general function).
Therefore you can use a function for any JSON:
void traversJson(QJsonObject json_obj){
foreach(const QString& key, json_obj.keys()) {
QJsonValue value = json_obj.value(key);
if(!value.isObject() ){
qDebug() << "Key = " << key << ", Value = " << value;
}
else{
qDebug() << "Nested Key = " << key;
traversJson(value.toObject());
}
}
};

accessing elements from nlohmann json

My JSON file resembles this
{
"active" : false,
"list1" : ["A", "B", "C"],
"objList" : [
{
"key1" : "value1",
"key2" : [ 0, 1 ]
}
]
}
Using nlohmann json now, I've managed to store it and when I do a dump jsonRootNode.dump(), the contents are represented properly.
However I can't find a way to access the contents.
I've tried jsonRootNode["active"], jsonRootNode.get() and using the json::iterator but still can't figure out how to retrieve my contents.
I'm trying to retrieve "active", the array from "list1" and object array from "objList"
The following link explains the ways to access elements in the JSON. In case the link goes dead, here is the code:
#include <iostream>
#include <json.hpp>
using namespace nlohmann;
int main()
{
// create JSON object
json object =
{
{"the good", "il buono"},
{"the bad", "il cativo"},
{"the ugly", "il brutto"}
};
// output element with key "the ugly"
std::cout << object.at("the ugly") << '\n';
// change element with key "the bad"
object.at("the bad") = "il cattivo";
// output changed object
std::cout << object << '\n';
// try to write at a nonexisting key
try
{
object.at("the fast") = "il rapido";
}
catch (const std::exception& e) // catches the library's out_of_range error in old and new versions alike
{
std::cout << "out of range: " << e.what() << '\n';
}
}
In case anybody else is still looking for the answer: you can simply access the contents using the same method as for writing to an nlohmann::json object. For example, to get values from the json in the question:
{
"active" : false,
"list1" : ["A", "B", "C"],
"objList" : [
{
"key1" : "value1",
"key2" : [ 0, 1 ]
}
]
}
just do:
nlohmann::json jsonData = nlohmann::json::parse(your_json);
std::cout << jsonData["active"] << std::endl; // returns boolean
std::cout << jsonData["list1"] << std::endl; // returns array
If the "objList" was just an object, you can retrieve its values just by:
std::cout << jsonData["objList"]["key1"] << std::endl; // returns string
std::cout << jsonData["objList"]["key2"] << std::endl; // returns array
But since "objList" is an array of objects, to access its values use:
for(auto &array : jsonData["objList"]) {
std::cout << array["key1"] << std::endl; // returns string
std::cout << array["key2"] << std::endl; // returns array
}
The loop runs only once, since "objList" is an array of size 1.
Hope it helps someone
I really like to use this in C++:
for (auto& el : object["list1"].items())
{
std::cout << el.value() << '\n';
}
It will loop over the array.

boost property tree getting first element

I was wondering if there were some convenient way to access a known index of a list using the path methodology.
My dream method
float v = pt.get<float>("root.list[0]");
Current known method (or something like it)
ptree::value_type listElement;
BOOST_FOREACH(listElement,tree.get_child("root.list")){
return listElement.second.get<float>();
}
Format of List (json)
{
root:{
list:[1,2,3,4,5]
}
}
You should be able to access the range of elements in the list using ptree's equal_range member function. With the JSON format you are using, there is no name associated with each item in the list. This means that it is necessary to get the parent node before accessing the child elements in the range.
The code below is a crude example that you could adapt:
Input Json File (in.json) :
{
"root" :
{
"list" : [1,2,3,4,5]
}
}
Function to print the nth element of the list:
void display_list_elem( const ptree& pt, unsigned idx )
{
// note: the list elements have no name, so we cannot get them
// directly; therefore we must access the parent node first
// and then get its children separately
// access the list node
BOOST_AUTO( listNode, pt.get_child("root.list") );
// get the children, i.e. the list elements
std::pair< ptree::const_assoc_iterator,
ptree::const_assoc_iterator > bounds = listNode.equal_range( "" );
std::cout << "Size of list : " << std::distance( bounds.first, bounds.second ) << "\n";
if ( idx >= static_cast<unsigned>( std::distance( bounds.first, bounds.second ) ) )
{
std::cerr << "ERROR Index too big\n";
return;
}
else
{
std::advance( bounds.first, idx );
std::cout << "Value # idx[" << idx << "] = "
<< bounds.first->second.get_value<std::string>() << "\n";
}
std::cout << "Displaying bounds....\n";
display_ptree( bounds.first->second, 10 );
}
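The core of the function above is "take the equal_range for a key, check the index against the size of the range, then advance". A minimal standalone sketch of that pattern follows. It uses a std::multimap as a stand-in for the property tree, which is an assumption made purely to keep the example free of Boost; the names Children and nthWithKey are hypothetical.

```cpp
#include <cstddef>
#include <iterator>
#include <map>
#include <optional>
#include <string>

// Hypothetical stand-in for a ptree's children: list elements all share the
// empty key, like the unnamed items of a JSON array in a property tree.
using Children = std::multimap<std::string, std::string>;

// Returns the idx-th value stored under key, or std::nullopt if the range
// for that key holds idx or fewer elements.
std::optional<std::string> nthWithKey(const Children& children,
                                      const std::string& key,
                                      std::size_t idx) {
    auto bounds = children.equal_range(key);
    const auto count =
        static_cast<std::size_t>(std::distance(bounds.first, bounds.second));
    // Valid indices are 0 .. count - 1, so idx == count is already too big.
    if (idx >= count)
        return std::nullopt;
    std::advance(bounds.first, idx);
    return bounds.first->second;
}
```

Returning std::optional instead of printing an error keeps the bounds check visible to the caller, who can then decide how to report a missing element.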