Power BI - Connection Type (DirectQuery or Import Data) Question

I am working on a Power BI project and need some advice on the best way to approach it. I am tasked with creating a dashboard of employee metrics pulled from an on-site SQL Server database. The managers here will have access to the Power BI cloud service, so I will end up publishing the report there. There are 10 or so metrics that need to be shown on the dashboard, and we have 5,000+ employees. My first thought was to dump all the metrics into a table and set the Power BI report to import the data, but that seems excessive and a waste of space, because not every manager needs access to every employee; they may only want to see one or two employees' metrics on the dashboard.
My second thought is to create a stored procedure (if this is possible) that takes an employee ID and outputs a dataset for Power BI to build a visual from. The dashboard would show a list of employees, and when a manager selects one, Power BI would call the stored procedure with that employee ID and render the returned dataset as a visual based on my measures. I guess I would set the Power BI report connection type to DirectQuery?
Here are my questions:
Is this possible? Is it possible to do what I am describing in my second plan? Is this how DirectQuery works?
If so, how does DirectQuery work with the Power BI cloud service?
What is the setup like? Do I just install and configure the on-premises data gateway as I would for Import mode, and Power BI does the rest?

A couple of questions first:
What is the frequency of data updates?
If the data is updated by a batch job, it is preferable to import the data from the source into the Power BI model and report on the imported data, because:
a) performance will be faster;
b) there is no back-and-forth of data between the on-premises database and the cloud;
c) the source is not being hit constantly.
Is the requirement row-level security (RLS), where each manager should see only the employees under them?
RLS is much easier to implement in an imported model than with DirectQuery.
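For illustration, a minimal sketch of what that imported-model RLS could look like, assuming a hypothetical ManagerEmail column on the employee metrics table; the role's DAX filter (set up under Modeling > Manage roles in Power BI Desktop) would be:

    // hypothetical table and column names; compares the row's manager
    // to the signed-in user's login
    'EmployeeMetrics'[ManagerEmail] = USERPRINCIPALNAME()

Each manager signing in to the Power BI service would then see only the rows for their own employees.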
Also, you won't be able to pass parameters to stored procedures, and you can't execute them in DirectQuery mode. You can, however, create table-valued functions, which let you use table variables and perform other, more complex operations in DirectQuery mode.
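If you go the table-valued function route, a minimal sketch of the Power Query side, assuming a hypothetical inline table-valued function dbo.fnEmployeeMetrics already defined on the SQL Server (the server and database names are also placeholders):

    let
        // DirectQuery over a native query: Power BI wraps this SELECT as a
        // subquery, so filters from slicers can still fold to the server
        Source = Sql.Database("myserver", "HRMetrics",
            [Query = "SELECT * FROM dbo.fnEmployeeMetrics()"])
    in
        Source

Rather than passing an employee ID in as a parameter, the function would return all employees, and the employee slicer on the report page would do the filtering, folded into the SQL that DirectQuery sends.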
You can refer to this for additional details:
https://community.powerbi.com/t5/Desktop/Can-i-call-Stored-Procedure-with-Direct-Query/m-p/267141#:~:text=%40Pallavi%20you%20won't%20be,nature%20in%20Direct%20Query%20mode.

Related

Power BI Embedded Approach for 100s of SQL Targets

I'm trying to find the best approach to delivering a BI solution to 400+ customers, each of which has its own database.
I've got Power BI Embedded working using service principal licensing, and I have the Power BI service connected to my data through the on-premises data gateway.
I've built my first report pointing at one of the customer databases, and it works nicely.
What I want to do next, when embedding the report, is to tell Power BI, for this session, to get the data from a different database.
I'm struggling to find anywhere this is explained, or to understand whether it is even possible.
I'm trying to avoid creating 400+ workspaces or 400+ datasets.
If someone could point me in the right direction, it would be appreciated.
You can configure the report to use parameters, and those parameters can drive the data source for your dataset:
https://www.phdata.io/blog/how-to-parameterize-data-sources-power-bi/
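As a minimal sketch (the parameter, server, and database names here are hypothetical), you would define a text parameter such as DatabaseName in Power Query and reference it in the source step:

    let
        // DatabaseName is a Power Query text parameter; the hosting app
        // overwrites its value through the REST API below before embedding
        Source = Sql.Database("myserver.database.windows.net", DatabaseName)
    in
        Source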
These parameters can be set by the app hosting the embedded report:
https://learn.microsoft.com/en-us/rest/api/power-bi/datasets/update-parameters-in-group
Because the app is setting the parameter, each user will only see their own data. Since this will be a live connection, you would need to think about whether the underlying server can support the workload.
An alternative solution would be to consolidate the customer databases into a single database (just the relevant tables) and use row-level security to restrict access for each customer. The advantage of this design is that you take the burden off the underlying SQL instance and push it into a Power BI dataset, which is built to handle huge datasets with sub-second response times.
More on that here: https://learn.microsoft.com/en-us/power-bi/enterprise/service-admin-rls
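For the embedded scenario specifically, a hedged sketch of such a role filter: the hosting app can pass a customer identifier in the embed token's CustomData property, and a DAX role filter like the following (table and column names are hypothetical) restricts each session to one customer's rows:

    // CUSTOMDATA() returns the CustomData string supplied in the embed token;
    // VALUE() converts it to a number, assuming CustomerId is numeric
    'Sales'[CustomerId] = VALUE(CUSTOMDATA())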

Create Power BI Datamart from Azure Analysis Services

I am trying to create a Power BI datamart from Azure Analysis Services. There is a data model available in Azure Analysis Services, and I can connect using the URL and database name. The data model has ~100 tables in it, with relationships already set up. My question is: if I want to create a Power BI datamart from the Azure Analysis Services data model, do I need to use the Get Data option of the Power BI datamart, connect to Azure Analysis Services, and select the table and fields 100 times to get all the tables of the data model into my datamart? Is there any import function where I can import all the tables at once?
Why do you want to copy data from AAS into a database?
The reason you find it difficult is that it's an odd thing to do. The query designer for AAS/SSAS generates MDX queries, which are intended to run aggregate queries that return a handful of rows and are wholly unsuitable for extracting whole tables. If you try, the queries will just run forever and fail.
It's possible to extract data from AAS/SSAS tabular models, but you must use DAX, not MDX, so you need to use the Power Query ("Transform Data") window and its advanced editor.
Each query to load a table should look like this, e.g. to load the 'Customer' table:
let
    // run a DAX query against the AAS model to return the whole table
    Dax = "evaluate Customer",
    Source = AnalysisServices.Database("asazure://southcentralus.asazure.windows.net/myserver", "mydatabase", [Query=Dax])
in
    Source
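There is no single-click "import everything" option, but you can avoid repeating the Get Data steps 100 times by turning the query above into a reusable function. A sketch, reusing the server and database from the example (the function name and invocation are up to you):

    // invoke once per table, or drive it from a list of table names
    (tableName as text) as table =>
    let
        // quote the table name so names containing spaces still parse as DAX
        Dax = "evaluate '" & tableName & "'",
        Source = AnalysisServices.Database(
            "asazure://southcentralus.asazure.windows.net/myserver",
            "mydatabase",
            [Query = Dax])
    in
        Source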

Power BI and shared datasets: how to allow users to create new measures and reports and publish them

We are having difficulty finding a way to share a dataset and allow users to use it to create and publish their own reports, including the ability to create new measures (DAX) and publish them themselves. Using the "service" live connection does not seem to allow that, and without it there seems to be an issue with refreshing the data once the dataset is downloaded and modified with new columns/measures, etc.
I would greatly appreciate any help on this. So far I have seen nothing that shows how to do any of this, so I have to assume it may not be possible? Thank you.
Live Connect to a Power BI Dataset allows for local measures.
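For example, a user connected live to the shared dataset can add a report-level measure like the following (the measure, table, and base-measure names here are hypothetical):

    // local measure defined on top of the shared dataset's [Sales] measure
    Sales YoY % =
    DIVIDE(
        [Sales] - CALCULATE([Sales], SAMEPERIODLASTYEAR('Date'[Date])),
        CALCULATE([Sales], SAMEPERIODLASTYEAR('Date'[Date]))
    )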
If you need more modeling changes when working with a remote dataset, the DirectQuery for Power BI Datasets and AAS feature (currently in preview) lets you mash up remote dataset tables with local tables, and allows adding calculated columns to remote tables.
But use this with some care, as query processing is split between the local model and the remote model(s), which can cause performance issues.

How can I add a last refreshed date visual to my Power BI report showing the last refreshed time of a data model in Azure Analysis Services

Context
I have a Power BI report in the Power BI service whose visuals are built from a data model held in an Azure Analysis Services instance (the data model itself is a database inside the Azure Analysis Services instance). The report uses a live connection to the data model in Analysis Services.
The data model itself has been deployed using Visual Studio to the Analysis Services instance.
Refreshes of the data model are performed by a function app whose functions refresh the latest three day-wise partitions. Currently, to view the last refresh time, we go to the Azure Analysis Services instance in the Azure portal and check the "Date Modified" timestamp field.
While reading, I came across the following articles, which explain ways to add a "last refreshed date" to a Power BI report (in the form of a visual). However, none of them specifically mentions the effect of having the data model stored in Azure Analysis Services, or whether the approach would work in that case.
last refresh date & time from SSAS Tabular cube
Automatically adding date for last refresh of data
Display Last Refreshed Date in Power BI - The Excelguru Blog
How to Add the Last Refreshed Date and Time to a Power BI Report
https://askgarth.com/blog/how-to-display-version-number-info-on-power-bi-reports/
Question
What's the best way to bring a "last refreshed datetime" visual into Power BI if the underlying data model is inside Azure Analysis Services, and how can this be done?
Do the above methods seem plausible, or would they need adapting in my case?
Perhaps a different approach would be required?
Yes, adding a calculated column (or calculated table) to your AAS model that stores the result of NOW() or UTCNOW() is the correct way to display the refresh time.
Calculated columns and tables are evaluated on refresh and stored in the model.
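A minimal sketch of what to add to the AAS model in Visual Studio (the table and column names are just illustrative):

    // one-row calculated table holding the time of the last processing run
    Last Refresh = ROW("LastRefreshedUTC", UTCNOW())

A card visual bound to the LastRefreshedUTC column then shows the last refresh time in the report. One caveat for your function-app setup: calculated tables are re-evaluated only when the refresh operation includes a recalculation (e.g. a full process), so a dataOnly-style refresh of the partitions would leave the value stale.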

Optimize data load from Azure Cosmos DB to Power BI

Currently we have a problem loading data when refreshing the report from the database: it has too many records, and it takes forever to load everything. The issue is how to load only the data from the last year, to avoid the long load. As far as I can see, the Cosmos DB connection dialog lets me enter a SQL query, but I don't know how to write one for this kind of non-relational database.
Power BI has an incremental refresh feature. You should be able to refresh the current year only.
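A hedged sketch of the incremental refresh filter for Cosmos DB, assuming the standard RangeStart/RangeEnd datetime parameters that incremental refresh requires, converted to epoch seconds to match the _ts system property (the account, database, and collection names are placeholders, and you should verify the query behaves as expected against your source):

    let
        // _ts is epoch seconds, so convert the RangeStart/RangeEnd datetime
        // parameters before splicing them into the Cosmos DB SQL query
        Epoch = #datetime(1970, 1, 1, 0, 0, 0),
        FromTs = Number.ToText(Duration.TotalSeconds(RangeStart - Epoch)),
        ToTs = Number.ToText(Duration.TotalSeconds(RangeEnd - Epoch)),
        Source = DocumentDB.Contents("https://myaccount.documents.azure.com",
            "mydb", "mycollection",
            [Query = "SELECT * FROM c WHERE c._ts >= " & FromTs
                & " AND c._ts < " & ToTs])
    in
        Source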
If that still doesn't meet expectations, I would look at a preview feature called Azure Synapse Link, which automatically pulls all Cosmos DB updates out into analytical storage that you can query much faster in Azure Synapse Analytics, in order to refresh Power BI faster.
Depending on the volume of the data, you will hit a number of issues. The first is that you may exceed your RU (request unit) limit, slowing down the extraction of the data from Cosmos DB. The second is transforming the data from JSON into a structured format.
I would try to write a query that specifies only the fields and items you need; that will reduce the processing time and the time to get the data.
For SQL queries, it will be something like:
SELECT * FROM c WHERE c.partitionEntity = 'guid'
For more information on the Cosmos DB SQL API syntax, see the documentation to get you started.
You can use the query window in the Azure portal, or Azure Storage Explorer, to test the query, then move it to Power BI.
It is highly recommended to extract the data into a place where it can be transformed into a structured format such as a table or CSV file.
For example, use Azure Databricks to extract the data, then turn the JSON into a table-formatted object.
You have the option of running Databricks notebook queries against Cosmos DB, or running Azure Databricks in its own instance. Another option would be to use the change feed and an Azure Function to send and shred the data into Blob Storage, and query it from there using Power BI, Databricks, Azure SQL Database, etc.
In the Source step of your query, you can make a selection based on the Cosmos DB _ts system property, like:
Query ="SELECT * FROM XYZ AS t WHERE t._ts > 1609455599"
In this case, 1609455599 is the Unix timestamp corresponding to 31.12.2020, 23:59:59 (UTC), so only data from 2021 onward will be selected.