I'm trying to use a Data Mapper mediator to transform a complex JSON document. I need to change the names of fields and convert null strings to empty strings.
Input:
{
"field1" : "value 1",
"field2" : "value 2",
"field3" : null,
"field4" : null,
[...]
}
Output:
{
"One" : "value 1",
"Two" : "value 2",
"Three" : "",
"Four" : "",
[...]
}
I've implemented my own nullToEmpty function and used it in a CustomFunction operation, but I cannot reuse it: if I do, it is duplicated for every field that uses it.
Is there a better way to do this transformation?
Thanks
At the moment, there is no way to reuse custom functions with the DataMapper. Unfortunately, you will have to duplicate the code.
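For reference, the transformation itself is small when written as plain JavaScript. The sketch below is not DataMapper configuration and does not remove the duplication inside the mapper; it only illustrates the intended logic. The fieldMap table and the remap helper are hypothetical names, and only the four fields shown in the question are mapped.

```javascript
// Hypothetical rename table: input field name -> output field name.
const fieldMap = { field1: "One", field2: "Two", field3: "Three", field4: "Four" };

// Convert null (or undefined) to an empty string; leave other values untouched.
function nullToEmpty(value) {
  return value === null || value === undefined ? "" : value;
}

// Build a new object with renamed keys and null values replaced.
function remap(input) {
  const output = {};
  for (const [oldName, newName] of Object.entries(fieldMap)) {
    output[newName] = nullToEmpty(input[oldName]);
  }
  return output;
}

const result = remap({ field1: "value 1", field2: "value 2", field3: null, field4: null });
console.log(JSON.stringify(result));
```

Run on the sample input from the question, this yields the four renamed fields with "Three" and "Four" set to empty strings.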
I'm using json for modern c++.
And I have a json file which contains some data like:
{
"London": {
"Adress": "londonas iela 123",
"Name": "London",
"Shortname": "LL"
},
"Riga": {
"Adrese": "lidostas iela 1",
"Name": "Riga",
"Shortname": "RIX"
}
I found a way to modify the values of "Adrese", "Name" and "Shortname".
As you can see, the key of each element and its "Name" value are set to the same thing.
But I need to change both the key and the "Name" value.
After the change, the file should look like this:
{
"Something_New": {
"Adress": "londonas iela 123",
"Name": "Something_New",
"Shortname": "LL"
},
"Riga": {
"Adrese": "lidostas iela 1",
"Name": "Riga",
"Shortname": "RIX"
}
I've tried:
/other_code/
json j
/functions_for_opening_json file/
j["London"]["Name"] = "Something_New"; //this changes the value "name"
j["London"] = "Something_New"; // But this replaces "London" with "Something_New"
                               // and deletes all of its inside values.
Then I tried something like:
for(auto& el : j.items()){
if(el.key() == "London"){
el.key() = "Something_New";}
}
But that didn't work either.
I would like something like j["London"] = "Something_New" that keeps all the values that originally belonged to "London".
The value associated with the key "London" is the entire JSON subtree object containing the other three keys and their values. The line j["London"] = "Something_New"; does not change the key "London" but its value, so you end up with the pair "London" : "Something_New", which overwrites the subtree. Object keys are stored internally in a std::map, so you cannot simply rename a key in place. Try:
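// Needs <string>, <utility> and <nlohmann/json.hpp>; json is nlohmann::json.
void change_key(json &j, const std::string& oldKey, const std::string& newKey)
{
    auto itr = j.find(oldKey);
    if (itr == j.end() || oldKey == newKey) {
        return; // key not found, or nothing to rename
    }
    std::swap(j[newKey], itr.value()); // move the subtree under the new key
    j.erase(itr);                      // remove the old entry, which now holds null
}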
And then
change_key(j, "London", "Something_New");
I want to create a small MongoDB search query that sorts the result set with exact matches first, followed by the number of matches.
For example, if I have the following labels
Physics
11th-Physics
JEE-IIT-Physics
Physics-Physics
Then, if I search for "Physics" it should sort as
Physics
Physics-Physics
11th-Physics
JEE-IIT-Physics
Looking for the sort of "scoring" you are talking about here is an exercise in "imperfect solutions". In this case, the "best fit" starts with "text search", and "imperfect" is the term to consider first when working with the text search capabilities of MongoDB.
MongoDB is "not" a dedicated "text search" product, nor is it ( like most databases ) trying to be one. The full capabilities of "text search" are reserved for dedicated products that do that as their area of expertise. So it may not be the best fit, but "text search" is offered as an option for those who can live with the limitations and don't want to implement another engine. Or not yet, at least.
With that said, let's look at what you can do with the data sample as given. First set up some data in a collection:
db.junk.insert([
{ "data": "Physics" },
{ "data": "11th-Physics" },
{ "data": "JEE-IIT-Physics" },
{ "data": "Physics-Physics" },
{ "data": "Something Unrelated" }
])
Then of course, to "enable" the text search capabilities, you need to index at least one of the fields in the document with the "text" index type:
db.junk.createIndex({ "data": "text" })
Now that is "ready to go", let's have a look at a first basic query:
db.junk.find(
{ "$text": { "$search": "\"Physics\"" } },
{ "score": { "$meta": "textScore" } }
).sort({ "score": { "$meta": "textScore" } })
That is going to give results like this:
{
"_id" : ObjectId("55af83b964876554be823f33"),
"data" : "Physics-Physics",
"score" : 1.5
}
{
"_id" : ObjectId("55af83b964876554be823f30"),
"data" : "Physics",
"score" : 1
}
{
"_id" : ObjectId("55af83b964876554be823f31"),
"data" : "11th-Physics",
"score" : 0.75
}
{
"_id" : ObjectId("55af83b964876554be823f32"),
"data" : "JEE-IIT-Physics",
"score" : 0.6666666666666666
}
So that is "close" to your desired result, but of course there is no "exact match" component. In addition, the logic here used by the text search capabilities with the $text operator means that "Physics-Physics" is the preferred match here.
This is because the engine does not recognize "non words" such as the "hyphen" in between. To it, the word "Physics" appears several times in the indexed content for the document, therefore it has a higher score.
Now the rest of your logic here depends on the application of "exact match" and what you mean by that. If you are looking for "Physics" in the string and "not" surrounded by "hyphens" or other characters then the following does not suit. But you can just match a field "value" that is "exactly" just "Physics":
db.junk.aggregate([
{ "$match": {
"$text": { "$search": "Physics" }
}},
{ "$project": {
"data": 1,
"score": {
"$add": [
{ "$meta": "textScore" },
{ "$cond": [
{ "$eq": [ "$data", "Physics" ] },
10,
0
]}
]
}
}},
{ "$sort": { "score": -1 } }
])
And that will give you a result that both looks at the "textScore" produced by the engine and then applies some math with a logical test. In this case where the "data" is exactly equal to "Physics" then we "weight" the score by an additional factor using $add:
{
"_id": ObjectId("55af83b964876554be823f30"),
"data" : "Physics",
"score" : 11
}
{
"_id" : ObjectId("55af83b964876554be823f33"),
"data" : "Physics-Physics",
"score" : 1.5
}
{
"_id" : ObjectId("55af83b964876554be823f31"),
"data" : "11th-Physics",
"score" : 0.75
}
{
"_id" : ObjectId("55af83b964876554be823f32"),
"data" : "JEE-IIT-Physics",
"score" : 0.6666666666666666
}
That is what the aggregation framework can do for you, by allowing manipulation of the returned data with additional conditions. The end result is passed to the $sort stage ( notice it is reversed in descending order ) to allow that new value to be the sorting key.
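The effect of the $add and $cond projection can be reproduced in plain JavaScript, which may make the arithmetic easier to follow. This is only an illustration that runs outside MongoDB: the textScore values are copied from the first query's output, and boostExact is a hypothetical helper name.

```javascript
// textScore values copied from the $text query output shown earlier.
const docs = [
  { data: "Physics-Physics", textScore: 1.5 },
  { data: "Physics", textScore: 1 },
  { data: "11th-Physics", textScore: 0.75 },
  { data: "JEE-IIT-Physics", textScore: 0.6666666666666666 }
];

// Mirrors { $add: [ textScore, { $cond: [ { $eq: [ "$data", term ] }, 10, 0 ] } ] }.
function boostExact(doc, term) {
  return doc.textScore + (doc.data === term ? 10 : 0);
}

const ranked = docs
  .map(doc => ({ data: doc.data, score: boostExact(doc, "Physics") }))
  .sort((a, b) => b.score - a.score); // descending, like { $sort: { score: -1 } }

console.log(ranked);
```

Only the document whose value is exactly "Physics" receives the extra 10, so it moves ahead of "Physics-Physics" while the remaining order is unchanged.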
But the aggregation framework can really only deal with "exact matches" like this on strings. At the time of writing, there is no facility to deal with regular expression matches or index positions in strings that return a meaningful value for projection. Not even a logical match. And the $regex operation is only used to "filter" in queries, so it is not of use here.
So if you were looking for something in a "phrase" that was a bit more involved than a "string equals" exact match, then the other option is using mapReduce.
This is another "imperfect" approach, as the limitations of the mapReduce command mean that the "textScore" from such a query by the engine is "completely gone". While the actual documents will be selected correctly, the inherent "ranking data" is not available to the engine. This is a by-product of how MongoDB "projects" the "score" into the document in the first place, and "projection" is not a feature available to mapReduce.
But you can "play with" the strings using JavaScript, as in my "imperfect" sample:
db.junk.mapReduce(
function() {
var _id = this._id,
score = 0;
delete this._id;
score += this.data.indexOf(search);
score += this.data.lastIndexOf(search);
emit({ "score": score, "id": _id }, this);
},
function() {},
{
"out": { "inline": 1 },
"query": { "$text": { "$search": "Physics" } },
"scope": { "search": "Physics" }
}
)
Which gives results like this:
{
"_id" : {
"score" : 0,
"id" : ObjectId("55af83b964876554be823f30")
},
"value" : {
"data" : "Physics"
}
},
{
"_id" : {
"score" : 8,
"id" : ObjectId("55af83b964876554be823f33")
},
"value" : {
"data" : "Physics-Physics"
}
},
{
"_id" : {
"score" : 10,
"id" : ObjectId("55af83b964876554be823f31")
},
"value" : {
"data" : "11th-Physics"
}
},
{
"_id" : {
"score" : 16,
"id" : ObjectId("55af83b964876554be823f32")
},
"value" : {
"data" : "JEE-IIT-Physics"
}
}
My own "silly little algorithm" here is basically taking both the "first" and "last" index position of the matched string here and adding them together to produce a score. It's likely not what you really want, but the point is that if you can code your logic in JavaScript, then you can throw it at the engine to produce the desired "ranking".
The only real "trick" to remember here is that the "score" must be the "preceding" part of the grouping "key", and that if you include the original document _id value then that composite key part must be renamed, otherwise the _id will take precedence in the ordering.
This is just part of mapReduce where as an "optimization" all output "key" values are sorted in "ascending order" before being processed by the reducer. Which of course does nothing here since we are not "aggregating", but just using the JavaScript runner and document reshaping of mapReduce in general.
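The first-plus-last index scoring used in the map function can be checked outside MongoDB with a few lines of plain JavaScript. This is a sketch only; positionScore is a hypothetical name, and the ascending sort stands in for the key ordering that mapReduce applies.

```javascript
// Same scoring as the map function: first index plus last index of the term.
function positionScore(data, search) {
  return data.indexOf(search) + data.lastIndexOf(search);
}

const labels = ["JEE-IIT-Physics", "Physics-Physics", "11th-Physics", "Physics"];

const ranked = labels
  .map(data => ({ data, score: positionScore(data, "Physics") }))
  .sort((a, b) => a.score - b.score); // ascending, as mapReduce orders its keys

console.log(ranked);
```

The scores come out as 0, 8, 10 and 16, matching the mapReduce output above.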
So the overall note is, those are the available options. None of them perfect, but you might be able to live with them or even just "accept" the default engine result.
If you want more then look at external "dedicated" text search products, which would be better suited.
Side Note: The $text searches here are preferred over $regex because they can use an index. A "non-anchored" regular expression ( without the caret ^ ) cannot use an index optimally with MongoDB. Therefore the $text searches are generally going to be a better base for finding "words" within a phrase.
One more way is to use the $indexOfCP aggregation operator to get the index of the matched string and then sort on that computed field.
Data insertion
db.junk.insert([
{ "data": "Physics" },
{ "data": "11th-Physics" },
{ "data": "JEE-IIT-Physics" },
{ "data": "Physics-Physics" },
{ "data": "Something Unrelated" }
])
Query
const data = "Physics";
db.junk.aggregate([
{ "$match": { "data": { "$regex": data, "$options": "i" }}},
{ "$addFields": { "score": { "$indexOfCP": [{ "$toLower": "$data" }, { "$toLower": data }]}}},
{ "$sort": { "score": 1 }}
])
Output:
[
{
"_id": ObjectId("5a934e000102030405000000"),
"data": "Physics",
"score": 0
},
{
"_id": ObjectId("5a934e000102030405000003"),
"data": "Physics-Physics",
"score": 0
},
{
"_id": ObjectId("5a934e000102030405000001"),
"data": "11th-Physics",
"score": 5
},
{
"_id": ObjectId("5a934e000102030405000002"),
"data": "JEE-IIT-Physics",
"score": 8
}
]
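Note that "Physics" and "Physics-Physics" both score 0 here, so the sort alone does not decide which of the two comes first. One possible tiebreaker, which is an assumption and not part of the query above, is to rank an exact match ahead of other values with the same index. The plain JavaScript sketch below illustrates the idea outside MongoDB; for ASCII input, indexOf behaves like $indexOfCP.

```javascript
const labels = ["Physics-Physics", "JEE-IIT-Physics", "Physics", "11th-Physics", "Something Unrelated"];
const term = "Physics";

const ranked = labels
  .map(data => ({
    data,
    // Same idea as $indexOfCP on lower-cased strings.
    score: data.toLowerCase().indexOf(term.toLowerCase()),
    // Assumed tiebreaker, not in the original query: 0 for an exact match, else 1.
    exact: data.toLowerCase() === term.toLowerCase() ? 0 : 1
  }))
  .filter(doc => doc.score !== -1) // stands in for the $regex $match stage
  .sort((a, b) => a.score - b.score || a.exact - b.exact);

console.log(ranked.map(doc => doc.data));
```

In an aggregation pipeline the same effect could be approached by computing a second field and sorting on both, but that variation is untested here.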
I am fairly new to JasperReports and am having a challenge getting list data to show up correctly from MongoDB.
I was working off of an article, but cannot seem to get it to work.
I have the following collection in MongoDB:
{ "_id" : ObjectId("51e24462945f8796ea8e731d"), "id" : "1001", "cust" : "abc",
  "lines" : [ { "line number" : "line1", "product" : "ProdA" },
              { "line number" : "line2", "product" : "ProdB" } ] }
{ "_id" : ObjectId("51e246fb945f8796ea8e731e"), "id" : "1002", "cust" : "abc",
  "lines" : [ { "line number" : "line1", "product" : "ProdA" },
              { "line number" : "line2", "product" : "ProdB" } ] }
"lines" is a collection.
In iReport, it shows up as a list, which is good. However, when I do as the article suggests and change the sub datasource to new net.sf.jasperreports.engine.data.JRMapCollectionDataSource($F{lines}), I still get the List as a string, which just shows up as
[[line number : line1, product: ProdA],[line number : line2, product: ProdB]]
Shouldn't using this JRMapCollectionDataSource parse this out for me already? If not, how do I handle this?
Have you tried to access the list data by using field names "lines.line number" and "lines.product"? That might do the trick.
I figured this out. You have to create an empty data set and from there map the fields to the ${lines} array. For anyone that finds themselves in the same predicament as myself, I highly recommend reading the example JRXML file the author put into the article (something I didn't notice was there at first).
Thanks
By a numeric value I mean a value in the following JSON that should not be contained within double quotes. I've written a one-off workaround for this but a generic REReplace() that can be re-used would be a fantastic help.
So this
{
"collapse_key" : "demo",
"delay_while_idle" : true,
"registration_ids" : ["xyz"],
"data" : {
"key1" : "value1",
"key2" : "value2",
},
"time_to_live" : "3"
},
becomes this:
{
"collapse_key" : "demo",
"delay_while_idle" : true,
"registration_ids" : ["xyz"],
"data" : {
"key1" : "value1",
"key2" : "value2",
},
"time_to_live" : 3
},
This should work:
s = reReplace(s, '"([\d.-]+)"', "\1", "ALL")
(Where s is your JSON string)
" matches a double quote
() says "remember this so I can reference it later as \1"
\d means "a digit"
. means a decimal point
- means a minus sign
+ means one or more of them
Note that this will match illegitimate "numbers" like "..0-1", but within the scope of your requirement, this is probably fine. One could convolute the regex to be more precise, but there is perhaps no gain from doing so here. Let me know if there's a false-positive risk here, and I can amend.
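To make the false-positive risk concrete, here is the same pattern in JavaScript regex syntax next to a stricter variant. This is a sketch, not CFML: the stricter pattern (optional minus sign, digits, optional fractional part) is an assumption about what should count as a number, and the sample input is invented for the illustration.

```javascript
const input = '{ "time_to_live" : "3", "ratio" : "-0.5", "range" : "1-2", "name" : "demo" }';

// Direct port of the pattern above: unquote any run of digits, dots and minus signs.
const loose = input.replace(/"([\d.-]+)"/g, "$1");

// Stricter variant (assumed definition of a number):
// optional minus sign, digits, optional fractional part.
const strict = input.replace(/"(-?\d+(?:\.\d+)?)"/g, "$1");

console.log(loose);
console.log(strict);
```

The loose pattern also unquotes "1-2", which leaves invalid JSON, while the stricter one leaves it as a string. Both would still unquote a digits-only string that is meant to stay a string, such as an identifier or a key.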
Or I imagine Peter is about to give a better answer anyhow ;-)
I am trying to combine regex and embedded object queries and failing miserably. I am either hitting a limitation of MongoDB or just getting something slightly wrong; maybe someone out there has encountered this. The documentation certainly doesn't cover this case.
data being queried:
{
"_id" : ObjectId("4f94fe633004c1ef4d892314"),
"productname" : "lightbulb",
"availability" : [
{
"country" : "USA",
"storeCode" : "abc-1234"
},
{
"country" : "USA",
"storeCode" : "xzy-6784"
},
{
"country" : "USA",
"storeCode" : "abc-3454"
},
{
"country" : "CANADA",
"storeCode" : "abc-6845"
}
]
}
Assume the collection contains only this one document.
This query returns 1:
db.testCol.find({"availability":{"country" : "USA","storeCode":"xzy-6784"}}).count();
This query returns 1:
db.testCol.find({"availability.storeCode":/.*/}).count();
But, this query returns 0:
db.testCol.find({"availability":{"country" : "USA","storeCode":/.*/}}).count();
Does anyone understand why? Is this a bug?
thanks
You are referencing the embedded storeCode incorrectly: you are referencing it as an embedded object when in fact what you have is an array of objects. Compare these results:
db.testCol.find({"availability.0.storeCode":/x/});
db.testCol.find({"availability.0.storeCode":/a/});
Using your sample doc above, the first one will not return anything, because the first storeCode ("abc-1234") does not have an x in it; the second will return the document. That's fine for the case where you are looking at a single element of the array and pass in the position. In order to search all of the objects in the array, you want $elemMatch.
As an example, I added this second example doc:
{
"_id" : ObjectId("4f94fe633004c1ef4d892315"),
"productname" : "hammer",
"availability" : [
{
"country" : "USA",
"storeCode" : "abc-1234"
},
]
}
Now, have a look at the results of these queries:
PRIMARY> db.testCol.find({"availability" : {$elemMatch : {"storeCode":/a/}}}).count();
2
PRIMARY> db.testCol.find({"availability" : {$elemMatch : {"storeCode":/x/}}}).count();
1
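Conceptually, $elemMatch asks whether at least one array element satisfies all of the given conditions at once. The plain JavaScript sketch below imitates that with Array.prototype.some on the two sample documents; it is not how MongoDB implements the operator, and the elemMatch helper is a hypothetical name.

```javascript
const docs = [
  { productname: "lightbulb", availability: [
    { country: "USA", storeCode: "abc-1234" },
    { country: "USA", storeCode: "xzy-6784" },
    { country: "USA", storeCode: "abc-3454" },
    { country: "CANADA", storeCode: "abc-6845" }
  ] },
  { productname: "hammer", availability: [
    { country: "USA", storeCode: "abc-1234" }
  ] }
];

// Hypothetical helper: true when a single element satisfies every condition.
// Pass null as country to skip the country check.
function elemMatch(elements, country, codePattern) {
  return elements.some(el =>
    (country === null || el.country === country) && codePattern.test(el.storeCode));
}

const countA = docs.filter(d => elemMatch(d.availability, null, /a/)).length;
const countX = docs.filter(d => elemMatch(d.availability, null, /x/)).length;
const usaX = docs.filter(d => elemMatch(d.availability, "USA", /x/)).length;
const canadaX = docs.filter(d => elemMatch(d.availability, "CANADA", /x/)).length;

console.log(countA, countX, usaX, canadaX);
```

The first two counts reproduce the 2 and 1 shown above. The last two show that the country and the pattern must match within the same element: the lightbulb has a CANADA entry and an entry containing x, but not in one element. The query from the question would correspondingly put both conditions inside $elemMatch, as in {"availability": {$elemMatch: {"country": "USA", "storeCode": /.*/}}}.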