How to retrieve distinct properties of documents - mapreduce

In our CouchDB document database, we have documents with different "status" property values like this:
doc1: {status: "available"},
doc2: {status: "reserved"},
doc3: {status: "available"},
doc4: {status: "sold"},
doc5: {status: "available"},
doc6: {status: "destroyed"},
doc7: {status: "sold"}
[...]
Now, I would like to write a map-reduce function that returns all distinct status values that exist over all documents: ["available", "reserved", "sold", "destroyed"].
My approach was to begin writing a map function that returns only the "status" property of each document:
function (doc) {
if(doc.status) {
emit(doc._id, doc.status);
}
}
And now, I would like to compare all map rows to each other such that no status duplicates will be returned.
The official CouchDB documentation seems to be very detailed and technical, but cannot really be projected to our use case, which does not have any nested structures like in blog posts but simply "flat objects" with a "status" property. Besides, our backend uses PouchDB as an adapter to connect to our remote CouchDB.
I discovered that when executing the reduce function below (which I implemented myself trying to understand what happens under the hood), some strange result will be returned.
function(keys, values, rereduce) {
var array = [];
if(rereduce) {
return values;
} else {
if(array.indexOf(values[0]) === -1) {
array.push(values[0]);
}
}
return array;
}
Result:
{
"rows": [
{
"key": null,
"value": "[reduce] [status] available,available,[status] sold,unknown,[status] available,[status] available,[status] available,reserved,available,[status] reserved,available,[status] available,[status] sold,reserved,[status] sold,sold,[status] available,available,[status] reserved,[status] reserved,[status] available,[status] reserved,available"
}
]
}
The reduce step seems to be executed exactly once, while the status loops sometimes have only a single value, then two or three values, without a recognizable logic or pattern.
Could somebody please explain to me the following:
How to retrieve an array with all distinct status values
What is the logic (or workflow) of the reduce function of CouchDB? Why do status rows have an arbitrary number of status values?

Thanks to @chrisinmtown's comment I was able to implement the distinct retrieval of status values using the following functions:
function map(doc) {
if(doc.status) {
emit(doc.status, null);
}
}
function reduce(key, values) {
return null;
}
It is important to send the query parameter group = true as well. Without it, the reduce collapses all rows into a single row with key null, so no status values are returned:
// PouchDB request
return this.database.query('general/all-status', { group: true }).pipe(
map((response: PouchDB.Query.Response<any>) => response.rows.map((row: any) => row.key))
);
See also the official PouchDB documentation for further information on how to use views and queries.
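To illustrate why grouping yields the distinct values, here is a minimal sketch in plain JavaScript that simulates the view over the sample documents from the question. It is not CouchDB or PouchDB code: the helper names (mapDocs, queryGrouped, queryUngrouped) are made up for this illustration, and real CouchDB reduces in chunks over its B-tree rather than in one pass.

```javascript
// Plain JavaScript simulation of a map/reduce view; helper names are hypothetical.
const docs = [
  { _id: "doc1", status: "available" },
  { _id: "doc2", status: "reserved" },
  { _id: "doc3", status: "available" },
  { _id: "doc4", status: "sold" },
  { _id: "doc5", status: "available" },
  { _id: "doc6", status: "destroyed" },
  { _id: "doc7", status: "sold" },
];

// Map step: mirrors emit(doc.status, null).
function mapDocs(allDocs) {
  const rows = [];
  for (const doc of allDocs) {
    if (doc.status) rows.push({ key: doc.status, value: null });
  }
  return rows;
}

// group=true: rows are sorted by key and reduce runs once per distinct key.
function queryGrouped(allDocs, reduce) {
  const rows = mapDocs(allDocs).sort((a, b) =>
    a.key < b.key ? -1 : a.key > b.key ? 1 : 0
  );
  const groups = new Map();
  for (const row of rows) {
    if (!groups.has(row.key)) groups.set(row.key, []);
    groups.get(row.key).push(row.value);
  }
  return [...groups].map(([key, values]) => ({ key, value: reduce(values) }));
}

// No grouping: every row is reduced into one row whose key is null.
function queryUngrouped(allDocs, reduce) {
  const values = mapDocs(allDocs).map((row) => row.value);
  return [{ key: null, value: reduce(values) }];
}

const reduceToNull = () => null;
const distinctStatuses = queryGrouped(docs, reduceToNull).map((row) => row.key);
console.log(distinctStatuses); // [ 'available', 'destroyed', 'reserved', 'sold' ]
```

The reduce function itself does no work here; the distinct values come from the keys of the grouped rows, which is why the map function emits the status as the key rather than as the value.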

Related

Assert a string contains a certain value (and fail the test if it doesn't)

As part of my nightwatch.js testing, I have the following code that will list all the values of an element (in this case, a list of UK towns);
"Page 2 Location SEO Crawl paths are displayed": function (browser) {
browser.elements('xpath', '//a[contains(@href,"location")]', function (results) {
results.value.map(function(element) {
browser.elementIdAttribute(element.ELEMENT, 'innerText', function(res) {
var resme = res.value;
console.log(resme)
});
});
});
},
This correctly lists all the element values.
What I'd now like to do is check that Nottingham is listed in the result, and Fail the test if it's not.
I installed the assert npm package to see if that would help, which changed my code to;
"Page 2 Location SEO Crawl paths are displayed": function (browser) {
browser.elements('xpath', '//a[contains(@href,"location")]', function (results) {
results.value.map(function(element) {
browser.elementIdAttribute(element.ELEMENT, 'innerText', function(res) {
var resme = res.value;
console.log(resme);
if (resme.includes("Nottingham")) {
assert.ok(true);
}
else {
assert.ok(false);
}
});
});
});
},
but this didn't work, as I kept getting the following error;
Is using the assert package the best way of testing this, or is there a more straightforward way of asserting that Nottingham is included in this list, so that the test fails if it's not?
I've tried using resme.includes("Nottingham"), but this doesn't fail the test.
Any help would be greatly appreciated.
Thanks.
Looks like the inner-most function (the one that has res as parameter) is called for every item, and resme is the item you are currently iterating, not an array, so the includes function is not working as expected.
I'm not familiar with this library but I guess you have to do something like this:
"Page 2 Location SEO Crawl paths are displayed": function (browser) {
var found = false;
browser.elements('xpath', '//a[contains(@href,"location")]', function (results) {
results.value.map(function(element) {
browser.elementIdAttribute(element.ELEMENT, 'innerText', function(res) {
var resme = res.value;
console.log(resme);
if (resme === "Nottingham") {
found = true;
// Maybe return something like null or 0 to stop iterating (would depend on your library).
}
});
});
});
assert.ok(found)
},
You initialize a variable "found" to false and set it to true when one of the values you iterate over matches. Optionally, you can stop iterating at that point. When the whole process is finished, assert that you found the value you wanted.
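The same idea, collecting the values first and asserting once at the end, can be sketched in plain Node.js without Nightwatch. The town list and the helper name containsTown are made up for illustration; in the real test the array would be filled inside the elementIdAttribute callback, and the assertion would have to run only after those callbacks have completed.

```javascript
const nodeAssert = require("node:assert");

// Hypothetical sample of the link texts the test would collect.
const linkTexts = ["Derby", "Leicester", " Nottingham ", "Sheffield"];

// Returns true if any collected text equals the expected town.
// Each text is trimmed because innerText may carry surrounding whitespace.
function containsTown(texts, town) {
  return texts.some((text) => text.trim() === town);
}

// A single assertion over the whole list: it throws, and so fails the test,
// only when the town is missing from every collected value.
nodeAssert.ok(containsTown(linkTexts, "Nottingham"), "Nottingham should be listed");
```

Asserting per element, as in the question, fails on the first town that is not Nottingham; asserting once over the collected list fails only when no element matches.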

DynamoDB multiple filter conditions, gives error - buildTree error: unset parameter: ConditionBuilder

I am building REST APIs using Lambda and DynamoDB in Go.
I need to query the data based on multiple filters.
The number of filters varies with the number of query parameters the user provides when calling the REST API.
Following the post below, I wrote code to add multiple conditions:
AWS SDK for Go - DynamoDb - Add multiple conditions to FilterExpression
But when I invoke the function, I get the following error in the logs:
buildTree error: unset parameter: ConditionBuilder
The Filter expression is not applied and the scan returns all the results.
Here is the code snippet.
for queryParam, queryParamValue := range searchParams {
fmt.Println("queryParam:", queryParam, "=>", "queryParamValue:", queryParamValue)
if queryParam == "param1" {
param1Condition = expression.Name("param1").Equal(expression.Value(queryParamValue))
}
if queryParam == "param2" {
param2Condition = expression.Name("param2").Equal(expression.Value(queryParamValue))
}
}
sampleExpr, errSample := expression.NewBuilder().
WithCondition(param1Condition.Or(param2Condition)).
Build()
if errSample != nil {
fmt.Println("Error in building Sample Expr ", errSample)
} else {
fmt.Println("sampleExpr ", sampleExpr)
}
input := &dynamodb.ScanInput{
ExpressionAttributeNames: sampleExpr.Names(),
ExpressionAttributeValues: sampleExpr.Values(),
FilterExpression: sampleExpr.Filter(),
TableName: aws.String(deviceInfotable),
}
But if I create the expression in a different way, it works:
filt := expression.Name("param1").Equal(expression.Value("value1")).Or(expression.Name("param2").Equal(expression.Value("value2")))
ConditionBuilder has a mode field:
type ConditionBuilder struct {
operandList []OperandBuilder
conditionList []ConditionBuilder
mode conditionMode
}
The zero value of mode is unsetCond. When the condition is built, unsetCond raises the error:
https://github.com/aws/aws-sdk-go/blob/7798c2e0edc02ba058f7672d32f4ebf6603b5fc6/service/dynamodb/expression/condition.go#L1415
case unsetCond:
return exprNode{}, newUnsetParameterError("buildTree", "ConditionBuilder")
In your code, if searchParams does not contain both "param1" and "param2", the corresponding condition variable (param1Condition or param2Condition) remains the zero value of ConditionBuilder, which fails on build.

SuiteScript 2.0 Map Reduce Script Complete Sample

I hit the SSS_USAGE_LIMIT_EXCEEDED error in NetSuite.
I plan to change the search to use a Map/Reduce script. However, I haven't found a complete example of calling a Map/Reduce script, such as how to pass parameters to it and get the result set back. Could you please show me how? Thanks in advance.
The code below shows how to define the task that calls the Map/Reduce script:
SuiteScript 2.0 UserEvent Script to Call Map Reduce
define(['N/record', 'N/log', 'N/task'],
function (record, log, task) {
function setFieldInRecord (scriptContext) {
log.debug({
'title': 'TESTING',
'details': 'WE ARE IN THE FUNCTION!'
});
if (scriptContext.type === scriptContext.UserEventType.EDIT) {
var scriptTask = task.create({
taskType: task.TaskType.MAP_REDUCE
});
scriptTask.scriptId = 'customscript_id';
scriptTask.deploymentId = 'customdeploy_id';
var scriptTaskId = scriptTask.submit();
//How to pass parameter to getInputData?
//How to get the result?
}
}
return {
beforeSubmit: setFieldInRecord
};
}
);
The Map/Reduce script type provides four entry point functions to load and process your data:
getInputData(inputContext)
map(mapContext)
reduce(reduceContext)
summarize(summaryContext)
Example:
function summarize(context) {
context.output.iterator().each(function(key, value) {
// your logic here
return true;
});
}
Take a look at this help center section, which contains examples (only available with a NetSuite account):
https://system.netsuite.com/app/help/helpcenter.nl?fid=section_4387799161.html

How to return an JSON object I made in a reduce function

I need your help with a CouchDB reduce function.
I have some docs like:
{'about':'1', 'foo':'a1','bar':'qwe'}
{'about':'1', 'foo':'a1','bar':'rty'}
{'about':'1', 'foo':'a2','bar':'uio'}
{'about':'1', 'foo':'a1','bar':'iop'}
{'about':'2', 'foo':'b1','bar':'qsd'}
{'about':'2', 'foo':'b1','bar':'fgh'}
{'about':'3', 'foo':'c1','bar':'wxc'}
{'about':'3', 'foo':'c2','bar':'vbn'}
As you can see, they all have the same keys; only the values differ.
My purpose is to use Map/Reduce, and the result I expect would be:
'rows':[ 'keys':'1','value':{'1':{'foo':'a1', 'at':'rty'},
'2':{'foo':'a2', 'at':'uio'},
'3':{'foo':'a1', 'at':'iop'}}
'keys':'1','value':{'foo':'a1', 'bar','rty'}
...
'keys':'3','value':{'foo':'c2', 'bar',vbn'}
]
Here is the result of my Map function:
'rows':[ 'keys':'1','value':{'foo':'a1', 'bar','qwe'}
'keys':'1','value':{'foo':'a1', 'bar','rty'}
...
'keys':'3','value':{'foo':'c2', 'bar',vbn'}
]
But my Reduce function isn't working:
function(keys,values,rereduce){
var res= {};
var lastCheck = values[0];
for(i=0; i<values.length;++i)
{
value = values[i];
if (lastCheck.foo != value.foo)
{
res.append({'change':[i:lastCheck]});
}
lastCheck = value;
}
return res;
}
Is it possible to get what I expect, or do I need to do this another way?
You should not do this in the reduce function. As the CouchDB wiki explains:
If you are building a composite return structure in your reduce, or only transforming the values field, rather than summarizing it, you might be misusing this feature.
There are two approaches that you can take instead:
Transform the results at your application layer.
Use the list function.
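As a rough sketch of the first approach, the rows returned by the map-only view can be grouped by key in application code. The function name groupRowsByKey and the output shape (an array of values per key) are assumptions for illustration, not the exact structure asked for in the question; the rows are a subset of the question's sample data.

```javascript
// Rows shaped like the "rows" array of a CouchDB view response.
const rows = [
  { key: "1", value: { foo: "a1", bar: "qwe" } },
  { key: "1", value: { foo: "a1", bar: "rty" } },
  { key: "1", value: { foo: "a2", bar: "uio" } },
  { key: "2", value: { foo: "b1", bar: "qsd" } },
  { key: "3", value: { foo: "c1", bar: "wxc" } },
];

// Collect every value under its key, preserving the order of the view rows.
function groupRowsByKey(viewRows) {
  const grouped = {};
  for (const row of viewRows) {
    if (!grouped[row.key]) grouped[row.key] = [];
    grouped[row.key].push(row.value);
  }
  return grouped;
}

const grouped = groupRowsByKey(rows);
console.log(JSON.stringify(grouped, null, 2));
```

Because the view already returns rows sorted by key, a single pass is enough, and any further reshaping (for example, keeping only the rows where foo changes) can be done on the grouped arrays.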
List functions are simple. I will try to explain them here:
Like views, list functions are saved in design documents, under the key lists:
"lists":{
"formatResults" : "function(head,req) {....}"
}
To call the list function you use a url like this
http://localhost:5984/your-database/_design/your-designdoc/_list/your-list-function/your-view-name
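Here is an example of a list function:
function(head, req) {
  // Set the response headers before reading any rows.
  start({"headers": {"Content-Type": "application/json"}});
  var jsonOb = {};
  var row;
  // getRow() returns the next view row, or null once all rows are consumed.
  // Calling it only inside the loop ensures the first row is not skipped.
  while (row = getRow()) {
    // construct the json object here from row.key and row.value
  }
  // A list function returns a string, so serialize the object.
  return JSON.stringify(jsonOb);
}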
The getRow function is of interest to us. Each call returns the next row of the view result, so on each row we can read:
row.key for key
row.value for value
All you have to do now is construct the json like you want and then send it.
By the way, you can use log to debug your functions.
I hope this helps a little.
Apparently now you need to use
provides('json', function() { ... });
As in:
Simplify Couchdb JSON response

Mongodb - regex match of keys for subdocuments

I have some documents saved in a collection (called urls) that look like this:
{
payload:{
url_google.com:{
url:'google.com',
text:'search'
}
}
},
{
payload:{
url_t.co:{
url:'t.co',
text:'url shortener'
}
}
},
{
payload:{
url_facebook.com:{
url:'facebook.com',
text:'social network'
}
}
}
Using the mongo CLI, is it possible to look for subdocuments of payload that match /^url_/? And, if that's possible, would it also be possible to query on the match's subdocuments (for example, make sure text exists)?
I was thinking something like this:
db.urls.find({"payload":{"$regex":/^url_/}}).count();
But that's returning 0 results.
Any help or suggestions would be great.
Thanks,
Matt
It's not possible to query against document keys in this way. You can search for exact matches using $exists, but you cannot find key names that match a pattern.
I assume (perhaps incorrectly) that you're trying to find documents which have a URL sub-document, and that not all documents will have this? Why not push that type information down a level, something like:
{
payload: {
type: "url",
url: "Facebook.com",
...
}
}
Then you could query like:
db.foo.find({"payload.type": "url", ...})
I would also be remiss if I did not note that you shouldn't use dots (.) in key names in MongoDB. In some cases it's possible to create documents like this, but it will cause great confusion as you attempt to query into embedded documents (where Mongo uses dot as a "path separator", so to speak).
You can do it, but you need to use aggregation. Aggregation is a pipeline in which each stage is applied to each document, and there is a wide range of stages for various tasks.
I wrote an aggregation pipeline for this specific problem. If you need the documents themselves rather than the count, you might want to have a look at the $replaceRoot stage.
EDIT: This works only from Mongo v3.4.4 onwards (thanks for the hint @hwase0ng)
db.getCollection('urls').aggregate([
{
// creating a nested array with keys and values
// of the payload subdocument.
// all other fields of the original document
// are removed and only the field arrayofkeyvalue persists
"$project": {
"arrayofkeyvalue": {
"$objectToArray": "$$ROOT.payload"
}
}
},
{
"$project": {
// extract only the keys of the array
"urlKeys": "$arrayofkeyvalue.k"
}
},
{
// merge all documents
"$group": {
// _id is mandatory and can be set
// in our case to any value
"_id": 1,
// create one big (unfortunately double
// nested) array with the keys
"urls": {
"$push": "$urlKeys"
}
}
},
{
// "explode" the array and create
// one document for each entry
"$unwind": "$urls"
},
{
// "explode" again as the array
// is nested twice ...
"$unwind": "$urls"
},
{
// now "query" the documents
// with your regex (anchored, as in the question)
"$match": {
"urls": {
"$regex": /^url_/
}
}
},
{
// finally count the number of
// matched documents
"$count": "count"
}
])
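To check what the pipeline should return on sample data, its logic can be mirrored in plain JavaScript. This is a sketch outside MongoDB, not a replacement for the aggregation: the function name matchingPayloadKeys and the fourth, non-matching document are made up for illustration, and the extra check that text exists corresponds to the second part of the question.

```javascript
// Sample documents from the question, plus one hypothetical non-matching document.
const urls = [
  { payload: { "url_google.com": { url: "google.com", text: "search" } } },
  { payload: { "url_t.co": { url: "t.co", text: "url shortener" } } },
  { payload: { "url_facebook.com": { url: "facebook.com", text: "social network" } } },
  { payload: { other: { text: "no url key here" } } },
];

// Equivalent of $objectToArray + $unwind + $match: list every payload key
// that matches the pattern and whose subdocument has a "text" field.
function matchingPayloadKeys(docs, pattern) {
  return docs.flatMap((doc) =>
    Object.entries(doc.payload || {})
      .filter(
        ([key, sub]) =>
          pattern.test(key) &&
          sub !== null &&
          typeof sub === "object" &&
          "text" in sub
      )
      .map(([key]) => key)
  );
}

const keys = matchingPayloadKeys(urls, /^url_/);
console.log(keys.length); // 3
```

Running the same sample documents through the aggregation and through this function is a quick way to confirm that the regex and the stages behave as intended.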