Pulling MarkLogic template view data into Power BI

I am new to TDE. For the document below, I ended up developing 5 templates and was then able to write a JOIN query (below) that pulls all of the document data by linking the views on the __docid fragment ID.
It works fine when run in Query Console. However, when I try to pull the same data into Power BI via ODBC, I cannot write the query because __docid is not passed through.
Here are my questions:
How can I assign the __docid value to a view field?
If not possible, can I create a single template for the document?
Any other solution?
Thanks in advance.
URI: /json/2017.04.27_ID_NA_SL/chambers_2730.json
Document:
{
  "class": "sanction",
  "sanction":                ==> Template 1
  {
    "batch": "2017.04.27_ID_NA_SL",
    "id": "2017.04.27_IN_NA_SL/chambers_2730",
    "date_board_order": "2017-04-27T00:00:00",
    "date_effective": null,
    "decision": null,
    "reasoning": null,
    "pas_code": null,
    "method": "web",
    "orig": "results/results_04_27_2017_04_50PM/ID_SummaryList_03_04PM_February_27_2017/ID-John_chambers- 04_27_2017_BO.pdf",
    "professional":          ==> Template 2
    {
      "name_first": "John",
      "name_middle": null,
      "name_last": "chambers",
      "license": null,
      "me": "0499999999"
    }
  },
  "app":
  {
    "assignment":            ==> Template 3
    {
      "me": "Jessica Hernendez",
      "pas": "Jessica Hernendez"
    },
    "status":                ==> Template 4
    {
      "state": "complete",
      "me_complete": "true",
      "pas_complete": "true"
    },
    "meta":                  ==> Template 5
    {
      "alert": null,
      "note": null
    }
  }
}
Query:
SELECT t.__docid, p.name_first, p.name_middle, p.name_last, p.license, p.meta,
       s.batch, s.id, s.date_order, s.orig, a.me, t.state
FROM sanction s
JOIN professional p ON s.__docid=p.__docid
JOIN assignment a ON s.__docid = a.__docid
JOIN status t ON s.__docid = t.__docid
ORDER BY p.name_last

I am not sure whether you can literally expose the value of __docid as a TDE column, but you can use xdmp:node-uri(.) instead. That returns the database URI of the document, which is guaranteed to be unique.
I do wonder whether you need 5 templates, though. Your data doesn't seem to have repeated elements, so why not create one wide view that holds all of the sanction data? You could treat it as a special-purpose view optimized for Power BI and save yourself unnecessary joins at runtime.
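For what it's worth, here is a rough, untested sketch of what a single wide template could look like, inserted with server-side JavaScript. The schema, view, and column names are my own assumptions, and the parent-relative paths into the app section would need to be verified against your structure:
'use strict';
const tde = require('/MarkLogic/tde.xqy');

const template = xdmp.toJSON({
  template: {
    context: '/sanction',
    rows: [{
      schemaName: 'sanction',
      viewName: 'sanction_wide',
      columns: [
        // document URI in place of __docid
        { name: 'uri',         scalarType: 'string', val: 'xdmp:node-uri(.)' },
        { name: 'batch',       scalarType: 'string', val: 'batch' },
        { name: 'id',          scalarType: 'string', val: 'id' },
        { name: 'name_first',  scalarType: 'string', val: 'professional/name_first' },
        { name: 'name_last',   scalarType: 'string', val: 'professional/name_last' },
        // parent-relative paths to reach the sibling "app" section
        { name: 'assigned_me', scalarType: 'string', val: '../app/assignment/me', nullable: true },
        { name: 'state',       scalarType: 'string', val: '../app/status/state',  nullable: true }
      ]
    }]
  }
});

// validates the template and inserts it into the schemas database
tde.templateInsert('/tde/sanction-wide.json', template);
With a view like that in place, Power BI can run a plain SELECT against sanction.sanction_wide with no joins, and the uri column gives you a stable per-document key.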
HTH!


Loading a saved search in SuiteScript doesn't include all columns (NetSuite)

When loading a saved search in SuiteScript it doesn't include all columns; for example, the summed columns at the end are not included. I tried the getResults function, but because I am loading this in a map/reduce getInputData function and the data set is huge, the script time limit gets exceeded (SSS_TIME_LIMIT_EXCEEDED).
The columns marked in my screenshot are not visible when I load the search like this:
function getInputData() {
    var mainSrch = search.load({ id: 'customsearch1000' });
    return mainSrch;
}
Below is the result I get in the script:
{
"recordType": null,
"id": "16187",
"values": {
"GROUP(trandate)": "22/06/2022",
"GROUP(type)": {
"value": "VendBill",
"text": "Bill"
},
"GROUP(tranid)": "36380",
"GROUP(location)": {
"value": "140",
"text": "ACBD"
},
"GROUP(custitem_item_category.item)": {
"value": "13",
"text": "Frozen Food"
},
"GROUP(custitem_item_subcategory.item)": {
"value": "66",
"text": "Frozen Fruits & Vegetables"
},
"GROUP(itemid.item)": "MN-FGGH10271310",
"GROUP(displayname.item)": "ABC Product",
"GROUP(custcol_po_line_barcode)": "883638668390",
"GROUP(locationquantityonhand.item)": "4",
"SUM(quantity)": "1",
"SUM(totalvalue.item)": "4460.831",
"SUM(custcol_po_unit_price)": "8.00",
"SUM(formulanumeric)": "0"
}
}
Is there any way to get all the columns while loading a saved search?
I haven't seen this particular issue before, but NetSuite does have a known issue sorting by any formulaX column other than the first one, so this is not surprising.
If you have no selection criteria on the aggregate values you could:
modify your search to have no summary types or formula numeric columns;
in the map phase, group the rows by the original search's grouping columns (no governance cost);
in the reduce phase, calculate the values for the formulanumeric columns (no governance cost);
proceed with your original reduce phase logic.
A rough sketch of the map/reduce grouping idea follows this list.
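Something like this, where the grouping column names are assumptions taken from the search result shown in the question and the formula logic is left as a placeholder:
function map(context) {
    var result = JSON.parse(context.value); // one ungrouped search result row
    // group by the columns the original saved search was grouping on
    var key = [
        result.values.trandate,
        result.values.tranid,
        result.values.custcol_po_line_barcode
    ].join('|');
    context.write({ key: key, value: context.value });
}

function reduce(context) {
    var quantity = 0;
    context.values.forEach(function(v) {
        var row = JSON.parse(v);
        quantity += parseFloat(row.values.quantity) || 0;
    });
    // ...recreate the SUM()/formulanumeric values here, then continue with the
    // original reduce phase logic...
}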
As an alternative to my previous answer you can split your process into parts.
Modify your saved search to include column labels.
Use N/task to schedule your search, with a map/reduce script as a dependency added via addInboundDependency.
If the search finishes successfully, the map/reduce script will be called with your search result file.
Return the file from your getInputData phase (a sketch of that follows the fragment below). You'll have to modify your map/reduce script to handle a different format, but if your search can complete at all you'll be able to process it.
Below is a fragment of a script that does this, but uses a scheduled script as the dependency. Map/reduce scripts are also supported.
var filePath = folderPath + (folderPath.length ? '/' : '') + name;

// search task that runs the saved search and writes its results to filePath
var searchTask = task.create({
    taskType: task.TaskType.SEARCH,
    savedSearchId: searchId,
    filePath: filePath
});

// script to run once the search task completes successfully
var dependency = task.create({
    taskType: task.TaskType.SCHEDULED_SCRIPT,
    scriptId: 'customscript_kotn_s3_defer_transfer',
    deploymentId: deferredDeployment,
    params: {
        custscript_kotn_deferred_s3_folder: me.getParameter({ name: 'custscript_kotn_s3_folder' }),
        custscript_kotn_deferred_s3_file: filePath
    }
});

searchTask.addInboundDependency(dependency);
var taskId = searchTask.submit();

log.audit({
    title: 'queued ' + name,
    details: taskId
});
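On the map/reduce side, getInputData can then load the CSV the search task wrote and hand it to map line by line. A minimal, untested sketch, assuming the file path arrives via the custscript_kotn_deferred_s3_file parameter used in the fragment above:
/**
 * @NApiVersion 2.x
 * @NScriptType MapReduceScript
 */
define(['N/file', 'N/runtime'], function(file, runtime) {

    function getInputData() {
        // parameter name carried over from the fragment above; adjust to your own
        var filePath = runtime.getCurrentScript().getParameter({
            name: 'custscript_kotn_deferred_s3_file'
        });
        // getContents() is fine for files under the 10MB limit; for larger files
        // use the file's lines iterator instead
        var csv = file.load({ id: filePath }).getContents();
        return csv.split(/\r?\n/); // each line becomes one value passed to map()
    }

    function map(context) {
        var columns = context.value.split(','); // naive CSV split, placeholder only
        // ...process one search result row...
    }

    return {
        getInputData: getInputData,
        map: map
    };
});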

MongoDB: Aggregation using $cond with $regex

I am trying to group data in multiple stages.
At the moment my query looks like this:
db.captions.aggregate([
    { $project: {
        "videoId": "$videoId",
        "plainText": "$plainText",
        "Group1": {
            $cond: {
                if: { $eq: ["plainText", { "$regex": /leave\sa\scomment/i }] },
                then: "Yes",
                else: "No"
            }
        }
    }}
])
I am not sure whether it is actually possible to use the $regex operator within a $cond in the aggregation stage. I would appreciate your help very much!
Thanks in advance
UPDATE: Starting with MongoDB v4.1.11, there finally appears to be a nice solution for your problem which is documented here.
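Presumably the solution referred to is the $regexMatch aggregation operator; here is a quick sketch of the original projection rewritten with it (untested):
db.captions.aggregate([
    { $project: {
        "videoId": 1,
        "plainText": 1,
        "Group1": {
            $cond: {
                if: { $regexMatch: { input: "$plainText", regex: /leave\sa\scomment/i } },
                then: "Yes",
                else: "No"
            }
        }
    }}
])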
Original answer:
As I wrote in the comments above, $regex does not work inside $cond as of now. There is an open JIRA ticket for that but it's, err, well, open...
In your specific case, I would tend to suggest you solve that topic on the client side, unless you're dealing with crazy amounts of input data of which you will always only return small subsets. Judging by your query, it would appear that you are always going to retrieve all documents, just bucketed into two result groups ("Yes" and "No").
If you don't want to or cannot solve that topic on the client side, then here is something that uses $facet (MongoDB >= v3.4 required) - it's neither particularly fast nor overly pretty, but it might help you to get started.
db.captions.aggregate([{
    $facet: { // create two stages that will be processed using the full input data set from the "captions" collection
        "CallToActionYes": [{ // the first stage will...
            $match: { // only contain documents...
                "plainText": /leave\sa\scomment/i // that are allowed by the $regex filter (which could be extended with multiple $or expressions or changed to $in/$nin which accept regular expressions, too)
            }
        }, {
            $addFields: { // for all matching documents...
                "CallToAction": "Yes" // we create a new field called "CallToAction" which will be set to "Yes"
            }
        }],
        "CallToActionNo": [{ // similar as above except we're doing the inverse filter using $not
            $match: {
                "plainText": { $not: /leave\sa\scomment/i }
            }
        }, {
            $addFields: {
                "CallToAction": "No" // and, of course, we set the field to "No"
            }
        }]
    }
}, {
    $project: { // we got two arrays of result documents out of the previous stage
        "allDocuments": { $setUnion: [ "$CallToActionYes", "$CallToActionNo" ] } // so let's merge them into a single one called "allDocuments"
    }
}, {
    $unwind: "$allDocuments" // flatten the "allDocuments" result array
}, {
    $replaceRoot: { // restore the original document structure by moving everything inside "allDocuments" up to the top
        newRoot: "$allDocuments"
    }
}, {
    $project: { // include only the two relevant fields in the output (and the _id)
        "videoId": 1,
        "CallToAction": 1
    }
}])
As always with the aggregation framework, it may help to remove individual stages from the end of the pipeline and run the partial query in order to get an understanding of what each individual stage does.

Distinct value with count and condition in MongoDB

I am new to MongoDB, and so far it seems like it is trying to go out of its way to make simple things overly complex.
I am trying to run the MySQL equivalent below:
SELECT userid, COUNT(*)
FROM userinfo
WHERE userdata LIKE '%PC%' OR userdata LIKE '%wire%'
GROUP BY userid
I have MongoDB version 3.0.4 and I am running MongoChef.
I tried using something like this:
db.userinfo.group({
    "key": {
        "userid": true
    },
    "initial": {
        "countstar": 0
    },
    "reduce": function(obj, prev) {
        prev.countstar++;
    },
    "cond": {
        "$or": [{
            "userdata": /PC/
        }, {
            "userdata": /wire/
        }]
    }
});
but that did not like the OR.
When I took out the OR, thinking I'd do half at a time and combine the results in Excel, I got the error "group() can't handle more than 20000 unique keys", and the result table should be much bigger than that.
From what I can tell online, I could do this using aggregation pipelines, but I cannot find any clear examples of how to do that.
This seems like a simple thing that should be built into any DB, and it makes no sense to me that it is not.
Any help is much appreciated.
Works "sooo" much better with the .aggregate() method, as .group() is a very outmoded way of approaching this:
db.userinfo.aggregate([
    { "$match": {
        "userdata": { "$in": [/PC/, /wire/] }
    }},
    { "$group": {
        "_id": "$userid",
        "count": { "$sum": 1 }
    }}
])
The $in here is a much shorter way of writing your $or condition as well.
This is native code as opposed to JavaScript translation as well, so it runs much faster.
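For comparison, the same filter written out with $or instead of $in would be:
{ "$match": {
    "$or": [
        { "userdata": /PC/ },
        { "userdata": /wire/ }
    ]
}}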
Here is an example which counts the distinct number of first_name values for records with a last_name value of “smith”:
db.collection.distinct("first_name", { "last_name": "smith" }).length;
output
3

Create / Update multiple objects from one API response

All-new jsfiddle: http://jsfiddle.net/vJxvc/2/
Currently, I query an API that returns JSON like this. The API cannot be changed for now, which is why I need to work around it.
[
{"timestamp":1406111961, "values":[1236.181, 1157.695, 698.231]},
{"timestamp":1406111970, "values":[1273.455, 1153.577, 693.591]}
]
(could be a lot more lines, of course)
As you can see, each line has a timestamp and then an array of values. My problem is that I would actually like to transpose that. Looking at the first line alone:
{"timestamp":1406111961, "values":[1236.181, 1157.695, 698.231]}
It contains a few measurements taken at the same time. In my Ember project, it would need to become this:
{
"sensor_id": 1, // can be derived from the array index
"timestamp": 1406111961,
"value": 1236.181
},
{
"sensor_id": 2,
"timestamp": 1406111961,
"value": 1157.695
},
{
"sensor_id": 3,
"timestamp": 1406111961,
"value": 698.231
}
And those values would have to be pushed into the respective sensor models.
The transformation itself is trivial, but I have no idea where I would put it in Ember, or how I could alter many Ember models at the same time.
You could make your model an array and override the normalize method on your serializer. The normalize method is where you do the transformation, and since your JSON is an array, an Ember.Array as a model would work.
I am not an Ember pro, but looking at the manual I would think of something like this:
var a = [
    {"timestamp": 1406111961, "values": [1236.181, 1157.695, 698.231]},
    {"timestamp": 1406111970, "values": [1273.455, 1153.577, 693.591]}
];
var b = [];

a.forEach(function(item) {
    item.values.forEach(function(value, sensor_id) {
        b.push({
            sensor_id: sensor_id, // note: forEach indexes start at 0; add 1 if your sensor ids should start at 1
            timestamp: item.timestamp,
            value: value
        });
    });
});

console.log(b);
Example http://jsfiddle.net/kRUV4/
Update
Just saw your jsfiddle... You can get the store like this: How to get Ember Data's "store" from anywhere in the application so that I can do store.find()?
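Once you have the store, pushing the transposed records into it could look roughly like this. This is only a sketch: it assumes a sensor-reading model named 'reading' (which your app may not have) and uses the pre-Ember Data 2.0 store.push(type, payload) signature; newer versions take store.push({ data: ... }) instead.
b.forEach(function(reading) {
    store.push('reading', {
        // store.push needs an id; this synthetic one is purely an assumption
        id: reading.timestamp + '-' + reading.sensor_id,
        sensor_id: reading.sensor_id,
        timestamp: reading.timestamp,
        value: reading.value
    });
});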

How do Django Fixtures handle ManyToManyFields?

I'm trying to load around 30k XML files from clinicaltrials.gov into a MySQL database, and I am handling multiple locations, keywords, etc. in separate models using ManyToManyFields.
The best way I've figured out is to read the data in using a fixture. So my question is: how do I handle the fields where the data is a pointer to another model?
Unfortunately, I don't know enough about how ManyToMany/ForeignKey fields work to figure this out myself...
Thanks for the help. Sample code is below; the blanks (______) represent the ManyToMany fields.
{
"pk": trial_id,
"model": trials.trial,
"fields": {
"trial_id": trial_id,
"brief_title": brief_title,
"official_title": official_title,
"brief_summary": brief_summary,
"detailed_Description": detailed_description,
"overall_status": overall_status,
"phase": phase,
"enrollment": enrollment,
"study_type": study_type,
"condition": _______________,
"elligibility": elligibility,
"Criteria": ______________,
"overall_contact": _______________,
"location": ___________,
"lastchanged_date": lastchanged_date,
"firstreceived_date": firstreceived_date,
"keyword": __________,
"condition_mesh": condition_mesh,
}
}
A foreign key is simply the pk of the object you are linking to; a ManyToManyField uses a list of pks. So:
[
    {
        "pk": 1,
        "model": "farm.fruit",
        "fields": {
            "name": "Apple",
            "color": "Green"
        }
    },
    {
        "pk": 2,
        "model": "farm.fruit",
        "fields": {
            "name": "Orange",
            "color": "Orange"
        }
    },
    {
        "pk": 3,
        "model": "person.farmer",
        "fields": {
            "name": "Bill",
            "favorite": 1,
            "likes": [1, 2]
        }
    }
]
You will probably need to write a conversion script to get this done. Fixtures can be very flimsy; it's difficult to get them working, so experiment with a subset before you spend a lot of time converting the 30k records (only to find they might not import).