I'm developing a Power BI custom visual, but I have a problem: when the user adds dimensions to the visual, the order shown in the UI does not reflect the actual order of the columns in the data I get from Power BI (see screenshot, omitted here).
This is very limiting in a lot of scenarios, for example if I want to draw a table with columns in the order that the user sets.
Why does the API behave like this? It doesn't seem logical to me; am I doing something wrong? Here is the data binding definition, if it helps:
"dataRoles": [
{
"displayName": "Fields",
"name": "fields",
"kind": "Grouping"
},
{
"displayName": "Measures",
"name": "measures",
"kind": "Measure"
}
],
"dataViewMappings": [
{
"table": {
"rows": {
"select": [
{
"for": {
"in": "fields"
}
},
{
"for":{
"in": "measures"
}
}
],
"dataReductionAlgorithm": {
"window": {
"count": 30000
}
}
}
}
}
]
I think I solved it.
You can get the actual column order like this:
dataView.metadata.columns.forEach(col => {
    // position of this column within the data role, as arranged by the user
    const columnOrder = col['rolesIndex'].XXX[0];
});
where XXX is the name of the Data Role, which in the example in my question would be fields.
Note that you have to access the rolesIndex property by key name like I did above (or alternatively cast to any before accessing it), because the DataViewMetadataColumn type is missing that property for some reason.
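For example, here is a minimal sketch (untested; it assumes the single data role is named fields, as in my question) that sorts the metadata columns into the order the user arranged them in the field well:

const orderedColumns = dataView.metadata.columns
    .filter(col => col.roles && col.roles['fields'])
    // rolesIndex is accessed by key because the typings don't declare it
    .sort((a, b) => a['rolesIndex']['fields'][0] - b['rolesIndex']['fields'][0]);

// orderedColumns[i] now corresponds to the i-th field shown in the UI,
// so a table visual can render its headers in that order:
const headers = orderedColumns.map(col => col.displayName);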
I'm a beginner in Power BI development.
I have a table which looks like this (447 rows):
(table screenshot omitted)
How do I configure my capabilities.json file to get the data from this table?
I have two GroupingOrMeasure data roles in dataRoles, and I limit each one with "max": 1.
The closest I've gotten to the table's data is this mapping, which only returns the 'name' column's values correctly:
"dataViewMappings": [
{
"categorical": {
"categories": {
"for": { "in": "name-col" }
},
"values": {
"select": [
{ "bind": { "to": "imports-col" } }
]
}
}
}
]
But with this I don't get the 'imports' column's values; dataView.categorical.categories contains only one array.
My goal is to get both the 'name' and 'imports' columns' values, with the 'imports' column having multiple values associated with each value of the 'name' column.
In most of the other cases I tried that did return both columns, each source had far fewer values (around 83 each), with each 'name' value listed only once.
Found the solution:
"table": {
"rows": {
"select": [
{ "for": { "in": "name-col" } },
{ "for": { "in": "imports-col" } }
]
}
}
This correctly returns each row.
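In the visual's code, the rows then come through dataView.table. A minimal sketch of reading them (untested; the logging is only illustrative):

public update(options: VisualUpdateOptions) {
    const table = options.dataViews[0].table;
    if (!table) { return; }
    // table.columns holds one metadata entry per bound column
    const headers = table.columns.map(col => col.displayName);
    // each row is an array of cell values aligned with table.columns
    for (const row of table.rows) {
        console.log(headers[0], row[0], headers[1], row[1]);
    }
}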
I'm using Vega-Lite in PBI and I'm trying to create a histogram whose X-axis range changes dynamically according to two fields (Start, End) from a table of my PBI report; these two fields repeat in each record, as can be seen in the image:
(screenshot of the PBI table omitted)
The examples of extent in the documentation that I found used only numbers, so my initial code was this:
(screenshot of my initial code omitted)
Now I'm trying the following code instead:
{
    "data": {"name": "dataset"},
    "mark": {
        "type": "bar",
        "tooltip": true
    },
    "encoding": {
        "x": {
            "bin": {
                "extent": [
                    {"field": "Start"},
                    {"field": "End"}
                ],
                "step": 1
            },
            "field": "Actual"
        },
        "y": {"aggregate": "count"}
    }
}
But it throws a lot of warnings, and in the end it does nothing:
(screenshot of the errors omitted)
Thanks.
Setting the bin extent based on a field name is not supported. Per the Vega-Lite docs, bin.extent is
A two-element ([min, max]) array indicating the range of desired bin values.
where min and max are each numbers.
As to your specific use case: I would suggest filtering out the data points that fall outside the range you want to keep (see the filter docs).
Your filter might look something like this (it will drop all points outside the extent you specified):
{
    "transform": [
        {
            "filter": {"field": "Actual", "range": [25, 45]}
        }
    ],
    ...
}
If you need [25, 45] to be dynamic, you can set each bound using an ExprRef instead of a number, like this:
"range": [
{"expr": "datum.Start"},
{"expr": "datum.End"}
]
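Putting the two pieces together, the whole spec might look something like this (a sketch only; I haven't verified that datum is available to ExprRefs inside a filter predicate in your PBI setup):

{
    "data": {"name": "dataset"},
    "transform": [
        {
            "filter": {
                "field": "Actual",
                "range": [{"expr": "datum.Start"}, {"expr": "datum.End"}]
            }
        }
    ],
    "mark": {"type": "bar", "tooltip": true},
    "encoding": {
        "x": {"bin": {"step": 1}, "field": "Actual"},
        "y": {"aggregate": "count"}
    }
}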
I have this JSON document:
{
    "name": "Pete",
    "age": 24,
    "subjects": [
        {
            "name": "maths",
            "grade": "A"
        },
        {
            "name": "maths",
            "grade": "B"
        }
    ]
}
and I want to ingest it into a Pinot table so that I can run a query like
select age, subjects_grade, count(*) from table group by age, subjects_grade
Is there a way to do this in a Pinot job?
Pinot has two ways to handle JSON records:
1. Flatten the record at ingestion time:
In this case, we treat each nested field as a separate field, so we need to:
Define those fields in the table schema
Define transform functions in the table config to flatten the nested fields
Please see how the columns subjects_name and subjects_grade are defined below. Since subjects is an array, both fields are multi-value columns in Pinot.
2. Directly ingest JSON records:
In this case, we treat the nested record as one single field, so we need to:
Define the JSON field in the table schema as a STRING with a maxLength value
Put this field into noDictionaryColumns and jsonIndexColumns in the table config
Define a jsonFormat transform function in the table config to stringify the JSON field
Please see how the column subjects_str is defined below.
Below is the sample table schema/config/query:
Sample Pinot Schema:
{
    "metricFieldSpecs": [],
    "dimensionFieldSpecs": [
        {
            "dataType": "STRING",
            "name": "name"
        },
        {
            "dataType": "LONG",
            "name": "age"
        },
        {
            "dataType": "STRING",
            "name": "subjects_str"
        },
        {
            "dataType": "STRING",
            "name": "subjects_name",
            "singleValueField": false
        },
        {
            "dataType": "STRING",
            "name": "subjects_grade",
            "singleValueField": false
        }
    ],
    "dateTimeFieldSpecs": [],
    "schemaName": "myTable"
}
Sample Table Config:
{
    "tableName": "myTable",
    "tableType": "OFFLINE",
    "segmentsConfig": {
        "segmentPushType": "APPEND",
        "segmentAssignmentStrategy": "BalanceNumSegmentAssignmentStrategy",
        "schemaName": "myTable",
        "replication": "1"
    },
    "tenants": {},
    "tableIndexConfig": {
        "loadMode": "MMAP",
        "invertedIndexColumns": [],
        "noDictionaryColumns": [
            "subjects_str"
        ],
        "jsonIndexColumns": [
            "subjects_str"
        ]
    },
    "metadata": {
        "customConfigs": {}
    },
    "ingestionConfig": {
        "batchIngestionConfig": {
            "segmentIngestionType": "APPEND",
            "segmentIngestionFrequency": "DAILY",
            "batchConfigMaps": [],
            "segmentNameSpec": {},
            "pushSpec": {}
        },
        "transformConfigs": [
            {
                "columnName": "subjects_str",
                "transformFunction": "jsonFormat(subjects)"
            },
            {
                "columnName": "subjects_name",
                "transformFunction": "jsonPathArray(subjects, '$.[*].name')"
            },
            {
                "columnName": "subjects_grade",
                "transformFunction": "jsonPathArray(subjects, '$.[*].grade')"
            }
        ]
    }
}
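For the sample record in the question, my reading of these transforms is that they would produce values along these lines (illustrative, not verified output):

subjects_str   = "[{\"name\":\"maths\",\"grade\":\"A\"},{\"name\":\"maths\",\"grade\":\"B\"}]"
subjects_name  = ["maths", "maths"]
subjects_grade = ["A", "B"]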
Sample Queries:
-- Solution 1: group by the flattened multi-value column
select age, subjects_grade, count(*) from myTable GROUP BY age, subjects_grade
-- Solution 2: extract the grade from the JSON string at query time
select age, json_extract_scalar(subjects_str, '$.[*].grade', 'STRING') as subjects_grade, count(*) from myTable GROUP BY age, subjects_grade
Comparing the two approaches: we recommend solution 1, flattening the nested fields, when field density is high (e.g. every document has name and grade, so it's worth extracting them into new columns); it gives better query performance and better storage efficiency.
Solution 2 is simpler to configure and is good for sparse fields (e.g. only a few documents have certain fields), but it requires the json_extract_scalar function to access the nested fields.
Please also note the behavior of Pinot GROUP BY on multi-value columns.
More references:
Pinot Column Transformation
Pinot JSON Functions
Pinot JSON Index
Pinot Multi-value Functions
Goal:
I have a JSON payload with the following format:
{
    "Values": [
        {
            "Details": {
                "14342": {
                    "2016-06-07T00:00:00": {
                        "Value": 99.62,
                        "Count": 7186
                    },
                    "2016-06-08T00:00:00": {
                        "Value": 99.73,
                        "Count": 7492
                    }
                },
                "14362": {
                    "2016-06-07T00:00:00": {
                        "Value": 97.55,
                        "Count": 1879
                    },
                    "2016-06-08T00:00:00": {
                        "Value": 92.68,
                        "Count": 355
                    }
                }
            },
            "Key": "query5570027",
            "Total": 0.0
        },
        {
            "Details": {
                "14342": {
                    "2016-06-07T00:00:00": {
                        "Value": 0.0,
                        "Count": 1018
                    },
                    "2016-06-08T00:00:00": {
                        "Value": 0.0,
                        "Count": 1227
                    }
                }
            },
            "Key": "query4004194",
            "Total": 0.0
        }
    ],
    "LatencyInMinute": 0.0
}
I want to load this into Power BI and produce a table like so:
(target table screenshot omitted)
Notice how each Value + Count pair has its own row and some elements are repeated.
Problem: When I try to do this in Power BI (via Power Query), I get three initial columns, one of which is Details. Trouble is that I can expand Details, but I just get more columns, when what I really want is rows. I tried transposing, pivoting columns, and such, but nothing helped. My troubles are exacerbated by Power Query treating the nested data elements as column names.
Question: Is there a way, in M, to convert this nested JSON payload to the table example I illustrated above?
Chris Webb wrote a recursive function to expand all table-type columns - I've managed to clone it for record-type columns:
https://gist.github.com/Mike-Honey/0a252edf66c3c486b69b
If you use Record.FromList for the expansion it should work.
You can find an example in the script here: https://chris.koester.io/wp-content/uploads/2016/04/TransformJsonArrayWithPowerQueryImkeFeldmann.txt
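For this particular payload, a hand-rolled query is also an option. Here is a sketch in M (untested; it assumes the JSON sits in a local payload.json file) that unpivots the nested records into one row per Key/ID/date:

let
    Source = Json.Document(File.Contents("C:\payload.json")),
    // one row per element of the "Values" array
    AsTable = Table.FromList(Source[Values], Splitter.SplitByNothing(), {"Item"}),
    Expanded = Table.ExpandRecordColumn(AsTable, "Item", {"Key", "Details"}),
    // Details is a record keyed by ID; Record.ToTable turns it into rows
    IdRows = Table.ExpandTableColumn(
        Table.TransformColumns(Expanded, {{"Details", Record.ToTable}}),
        "Details", {"Name", "Value"}, {"ID", "Dates"}),
    // each Dates value is a record keyed by timestamp; same trick again
    DateRows = Table.ExpandTableColumn(
        Table.TransformColumns(IdRows, {{"Dates", Record.ToTable}}),
        "Dates", {"Name", "Value"}, {"Date", "Pair"}),
    // finally expand the {Value, Count} record into two columns
    Result = Table.ExpandRecordColumn(DateRows, "Pair", {"Value", "Count"})
in
    Result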
If an article has several comments (think thousands over time), should data.relationships.comments return with a limit?
{
    "data": [
        {
            "type": "articles",
            "id": 1,
            "attributes": {
                "title": "Some title"
            },
            "relationships": {
                "comments": {
                    "links": {
                        "related": "https://www.foo.com/api/v1/articles/1/comments"
                    },
                    "data": [
                        { "type": "comments", "id": "1" },
                        ...
                        { "type": "comments", "id": "2000" }
                    ]
                }
            }
        }
    ],
    "included": [
        {
            "type": "comments",
            "id": 1,
            "attributes": {
                "body": "Lorem ipsum"
            }
        },
        ...
        {
            "type": "comments",
            "id": 2000,
            "attributes": {
                "body": "Lorem ipsum"
            }
        }
    ]
}
This starts to feel concerning when you think of compound documents (http://jsonapi.org/format/#document-compound-documents), which means the included section will list all comments as well, making the JSON payload quite large.
If you want to limit the number of records you get at a time from a long list, use pagination (see the JSON API spec).
I would load the comments separately with store.query (see the Ember docs), like so:
store.query('comments', { author_id: <author_id>, page: 3 });
which will return the relevant subset of comments.
If you don't initially want to make two requests per author, you could include the first 'page' in the authors request as you're doing now.
You may also want to look into an addon like Ember Infinity (untested), which will provide an infinite scrolling list and automatically make pagination requests.
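For reference, a paginated response from the related-comments endpoint could look roughly like this (the page query-parameter family is reserved but not standardized by the spec, so page[number]/page[size] here is just one common convention):

{
    "data": [
        { "type": "comments", "id": "1", "attributes": { "body": "Lorem ipsum" } }
    ],
    "links": {
        "next": "https://www.foo.com/api/v1/articles/1/comments?page[number]=2&page[size]=100",
        "last": "https://www.foo.com/api/v1/articles/1/comments?page[number]=20&page[size]=100"
    }
}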