Query a dataset with Power BI REST APIs using a Service Principal

Our goal is to query a dataset that is published to Power BI via the REST APIs (https://learn.microsoft.com/en-us/rest/api/power-bi/datasets/execute-queries). I'm not talking about the metadata of the dataset; I mean the row-level data contained within its tables.
We are going to write a service (probably on-premises) that will need to query this data, format it, and push it to another system. From what we understand, we could use a service principal as the identity to query the Power BI API and retrieve the data.
The very important factor here is that the service principal should not have access to the row-level data of any other dataset. If we have to separate the datasets into a different workspace, that is workable, but not preferred.

A service principal can be used to access that Power BI API. It will have access to the data only if it has been authorized on the workspace, so you do need a separate workspace in order to manage access to the dataset.
Sample in Postman (screenshot in the original answer):
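For reference, here is a rough Python equivalent of that Postman call: a minimal sketch assuming a service principal that has already been granted access to the workspace. The tenant/client IDs, secret, dataset ID, and the DAX query itself are all placeholders.

import requests

TENANT_ID = "<tenant-id>"
CLIENT_ID = "<service-principal-app-id>"
CLIENT_SECRET = "<service-principal-secret>"
DATASET_ID = "<dataset-id>"

# 1. Acquire an access token via the client-credentials flow.
token_resp = requests.post(
    f"https://login.microsoftonline.com/{TENANT_ID}/oauth2/v2.0/token",
    data={
        "grant_type": "client_credentials",
        "client_id": CLIENT_ID,
        "client_secret": CLIENT_SECRET,
        "scope": "https://analysis.windows.net/powerbi/api/.default",
    },
)
token = token_resp.json()["access_token"]

# 2. POST a DAX query to the executeQueries endpoint.
query_resp = requests.post(
    f"https://api.powerbi.com/v1.0/myorg/datasets/{DATASET_ID}/executeQueries",
    headers={"Authorization": f"Bearer {token}"},
    json={"queries": [{"query": "EVALUATE TOPN(10, 'Sales')"}]},  # placeholder DAX
)
query_resp.raise_for_status()

# Rows come back as a list of column-name/value dictionaries.
for row in query_resp.json()["results"][0]["tables"][0]["rows"]:
    print(row)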
In my experience, the Power BI Execute Queries (DAX) endpoint can be quite slow, so keep that in mind if your integration requires a quick response from the Power BI API.

Related

Power BI Embedded Approach for 100s of SQL Targets

I'm trying to find the best approach to delivering a BI solution to 400+ customers, each of which has its own database.
I've got Power BI Embedded working using service principal licensing, and I have the Power BI service connected to my data through the On-Premises Data Gateway.
I've built my first report pointing to one of the customer databases, which works lovely.
What I want to do next, when embedding the report, is to tell Power BI, for this session, to get the data from a different database.
I'm struggling to find somewhere where this is explained, or to understand if this is even possible.
I'm trying to avoid creating 400+ workspaces or 400+ datasets.
If someone could point me in the right direction, it would be appreciated.
You can configure the report to use parameters, and these parameters can be used to configure the source for your dataset:
https://www.phdata.io/blog/how-to-parameterize-data-sources-power-bi/
These parameters can be set by the app hosting the embedded report:
https://learn.microsoft.com/en-us/rest/api/power-bi/datasets/update-parameters-in-group
Because the app is setting the parameter, each user will only see their own data. Since this will be a live connection, you would need to think about how the underlying server can support the workload.
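As a sketch of what the hosting app might do, here is a hypothetical call to that endpoint, assuming the dataset exposes a parameter named CustomerDb. The parameter name, IDs, and token are placeholders, and note that an import-mode dataset would also need a refresh afterwards for the change to take effect.

import requests

WORKSPACE_ID = "<workspace-id>"
DATASET_ID = "<dataset-id>"
token = "<service-principal-access-token>"  # acquired as in the earlier sketch

# Point the dataset's hypothetical CustomerDb parameter at this
# session's customer database before rendering the embedded report.
resp = requests.post(
    f"https://api.powerbi.com/v1.0/myorg/groups/{WORKSPACE_ID}"
    f"/datasets/{DATASET_ID}/Default.UpdateParameters",
    headers={"Authorization": f"Bearer {token}"},
    json={"updateDetails": [{"name": "CustomerDb", "newValue": "Customer042"}]},
)
resp.raise_for_status()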
An alternative solution would be to consolidate the customer databases into a single database (just the relevant tables) and use row-level security to restrict access for each customer. The advantage of this design is that you take the burden off the underlying SQL instance and push it into a Power BI dataset, which is made to handle huge datasets with sub-second response times.
More on that here: https://learn.microsoft.com/en-us/power-bi/enterprise/service-admin-rls

What does refreshing a Dataflow in the PBI service actually do?

In the PBI service, there is a refresh option for dataflows. What does a refresh operation for dataflows actually do?
A Power BI dataflow is much like a standalone data storage component (internally using Azure Data Lake), and a refresh will simply update the data from the connected data source by applying all the predefined ETL steps.
The biggest advantage of dataflows is that a Power BI dataset can connect to more than one of them at a time, so you can define your ETL steps in one place only and feed the results into several datasets, avoiding code duplication.
Another advantage is probably that you can author your ETL code directly in the online service, without PBIDesktop.exe.
Be aware that refreshing a dataset does not trigger a refresh of the connected dataflows; those have to be scheduled separately, or chained yourself as in the sketch below.
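A minimal sketch of that chaining, using the documented dataflow and dataset refresh endpoints; the workspace and item IDs and the token are placeholders.

import requests

BASE = "https://api.powerbi.com/v1.0/myorg/groups/<workspace-id>"
token = "<access-token>"  # placeholder Power BI access token
headers = {"Authorization": f"Bearer {token}"}

# 1. Refresh the dataflow first so it re-runs its ETL steps.
requests.post(
    f"{BASE}/dataflows/<dataflow-id>/refreshes",
    headers=headers,
    json={"notifyOption": "NoNotification"},
).raise_for_status()

# In practice you would poll the dataflow's refresh transactions here and
# wait for it to finish before kicking off the dependent dataset.

# 2. Then refresh the dataset that reads from the dataflow.
requests.post(
    f"{BASE}/datasets/<dataset-id>/refreshes",
    headers=headers,
    json={"notifyOption": "NoNotification"},
).raise_for_status()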
Dataflows are essentially the cloud version of M queries in Power Query / Query Editor. A Dataflow is the ETL layer that connects to the data sources, extracts and transforms the data, then stores the result as a table.
When you refresh a dataflow, it's just like refreshing a query in a Power BI model: it re-connects to the underlying data sources, pulls in the data as it exists at the time of refresh, and stores the transformed result, which can then be used in data models.
Things are a bit more complex with DirectQuery, linked tables, and incremental refresh, which I'm choosing to ignore for the sake of simplicity.
Resources:
https://learn.microsoft.com/en-us/power-bi/transform-model/dataflows/dataflows-introduction-self-service
https://radacad.com/dataflow-vs-dataset-what-are-the-differences-of-these-two-power-bi-components

How to deploy Power BI reports and connect them to a single Power BI Dataset

As far as I know, deploying a Power BI report from Power BI Desktop results in two items: the report itself and the dataset. Deploying a new report that uses the same dataset will deploy the new report plus a second copy of the same dataset to the Power BI service. That is not what I want. To avoid confusing end users and others, I want only a single dataset deployed.
I want to use Azure DevOps to deploy to the Power BI service in a Dev, Test, and Prod fashion. The dataset will be an Azure Analysis Services data model, but the principle should be the same. I need exactly one dataset, and all reports must connect to that data model. I have heard that the REST API or PowerShell scripting can come to the rescue here.
So if any of you have done this or know of a good article that describes how to do this, I would be grateful.
Regards Geir
The best option is to separate the Power BI report into a front end and a back end. You create a file purely for the dataset if you are importing data, with no reports built on it. You can then create the reports using the service connection to the dataset, or in Power BI Desktop via the 'Power BI datasets' connection option. Both will use live connection mode, so you cannot add any other data sources to the model, for example a CSV file or a SQL database.
If you are connecting to an Azure Analysis Services data model, you can use this approach too; however, since such a dataset is only a connection, not a full-fat dataset, it should not be an issue to have copies of it. Having copies of the dataset is only an issue if you are importing data; in that case it is best to move the shared logic into dataflows, use the same front-end/back-end method, and plan the scheduling of the dataflows and then the datasets.
You can use the REST API to move reports and the datasets that they connect to, and to rebind a report to a dataset in another workspace (see the sketch below). If you have Power BI Premium, it includes a lifecycle tool to move items between dev/test/live workspaces.
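As an illustration, a hypothetical call to the Rebind Report In Group endpoint, which points an existing report at the single shared dataset; the IDs and token are placeholders.

import requests

WORKSPACE_ID = "<workspace-id>"
REPORT_ID = "<report-id>"
SHARED_DATASET_ID = "<shared-dataset-id>"
token = "<access-token>"  # placeholder Power BI access token

# Rebind the deployed report so it reads from the shared dataset.
resp = requests.post(
    f"https://api.powerbi.com/v1.0/myorg/groups/{WORKSPACE_ID}"
    f"/reports/{REPORT_ID}/Rebind",
    headers={"Authorization": f"Bearer {token}"},
    json={"datasetId": SHARED_DATASET_ID},
)
resp.raise_for_status()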
If you create a report in Desktop and choose the 'Power BI dataset' live connection to work over it, then when you upload the report to the same workspace, it will only upload the report and connect it to the existing dataset.
https://radacad.com/power-bi-shared-datasets-what-is-it-how-does-it-work-and-why-should-you-care

Want to take data from Power BI dataflow based on roles

I have created a dataflow in the Power BI service. Now my client's requirement is that they want to take data from the dataflow according to roles. There is a user table where the roles are already defined. My question is: without relationships between the tables, how am I supposed to filter the data from all the tables? Is it possible at all? Or how can I create relationships between the tables in the dataflow? Or is there an alternative way to take data from the dataflow according to roles? Please help. Thanks in advance.
If your data supports it, for example some sort of mapping between the user and the data they are allowed to see, you will need to use row-level security (RLS) to restrict what end users see in the report. You make the relationship between your dataflow tables and the mapping table in the Power BI dataset, not in the dataflow, and define a role whose DAX filter compares the mapping table to USERPRINCIPALNAME().
If you mean restricting access to the data in the dataflow itself based on role, for example so that when a user builds a report it only loads what they are allowed to see, then that functionality is not supported.
Hope that helps

How can I use a parameter in an MS Power BI web data source string?

I have a URL that returns a JSON object with everything I need for my Power BI embedded report. I get the data for the report by adding a new web data source and pasting the URL in. A few transformations later and, tada, a sexy report. The report shows lots of charts and graphs, etc. However, I need to be able to change the data source URL depending on who is looking at the report.
The report shows data for a single organization; you can only look at it if you're in that organization. How can I pass an organization's ID when embedding the report so that the data source will show different data?
For example, if my data source is defined in the originating .pbix as
Json.Document(Web.Contents("http://www.testdata.com/api/json?orgId=1"))
how can I change it to
Json.Document(Web.Contents("http://www.testdata.com/api/json?orgId=2"))
when I pull the report to embed it on a page?
I know you can filter data, but that means the data source URL has to pull ALL the data, which would be huge and intensive, just to have Power BI filter most of it out.
In short, I'm embedding a report on a website, and that report's only way to get data is via a JSON endpoint. That endpoint requires the user's org ID, so how do I pass it to Power BI, which in turn uses it in the data source URL?
Your only option for this scenario is to pull all the required data into your dataset. Then you can use either Row-Level Security (RLS) or the JavaScript embedding API to filter the data for each user (see the sketch below for the RLS route).
You should probably look at an Azure SQL data source as a more efficient, flexible and scalable back-end for PBI Embedded.
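As an illustration of the RLS route, here is a hypothetical sketch of the hosting app generating an embed token with an effective identity. It assumes the dataset defines an RLS role named OrgFilter that filters rows using USERPRINCIPALNAME(); the IDs, role name, username value, and token are all placeholders.

import requests

WORKSPACE_ID = "<workspace-id>"
REPORT_ID = "<report-id>"
DATASET_ID = "<dataset-id>"
token = "<service-principal-access-token>"  # placeholder

resp = requests.post(
    f"https://api.powerbi.com/v1.0/myorg/groups/{WORKSPACE_ID}"
    f"/reports/{REPORT_ID}/GenerateToken",
    headers={"Authorization": f"Bearer {token}"},
    json={
        "accessLevel": "View",
        "identities": [{
            # This value surfaces as USERPRINCIPALNAME() inside the dataset,
            # so the hypothetical OrgFilter role can resolve it to an org ID.
            "username": "org-2",
            "roles": ["OrgFilter"],
            "datasets": [DATASET_ID],
        }],
    },
)
resp.raise_for_status()
embed_token = resp.json()["token"]  # hand this to the embedding page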