Our current architecture for reporting and dashboarding is similar to the following:
[Sql Azure] <-> [Azure Analysis Services (AAS)] <-> [Power BI]
We have almost 30 Power BI Pro licenses (no Premium tier).
As we migrate our on-premises data feeds to ADLS Gen2 with Data Factory and Databricks (in the long run, we will decommission the Azure SQL databases), we are investigating how to connect Power BI to the Delta tables.
Several sources suggest using Databricks SQL endpoints for this purpose:
https://www.youtube.com/watch?v=lkI4QZ6FKbI&t=604s
IMHO this is nice as long as you have a few reports. What if you have, say, 20-30? Is there a middle layer between ADLS Gen2 delta tables and Power BI for a scalable and efficient tabular model? How to define measures, calculated tables, manage relationships efficiently without the hassle of doing this from scratch in every single .pbix?
[ADLS Gen2] <-> [?] <-> [Power BI]
As far as I can tell, no AAS Direct Query is allowed in this scenario:
https://learn.microsoft.com/en-us/azure/analysis-services/analysis-services-datasource
Is there a workaround to avoid the use of Azure Synapse Analytics? We are not using it, and I am afraid we will not include it in the roadmap.
Thanks in advance for your invaluable piece of advice
Is there a middle layer between ADLS Gen2 delta tables and Power BI for a scalable and efficient tabular model?
If you want to build Power BI Import Models from Delta tables without routing through Databricks SQL or Spark, you can look into the new Delta Sharing Connector for Power BI. Or run a Spark job to export the model data to a Data Lake format that Power BI/AAS can read directly.
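The Spark export mentioned above could look roughly like the sketch below. It assumes Parquet as the target format and that a SparkSession with Delta support is already available (as in a Databricks notebook). The function name and both paths are hypothetical placeholders, not something from the original setup.

```python
def export_delta_to_parquet(spark, delta_path, parquet_path):
    """Read a Delta table and rewrite it as plain Parquet files.

    `spark` is an existing SparkSession; `delta_path` and `parquet_path`
    are hypothetical locations supplied by the caller.
    """
    # Load the current snapshot of the Delta table.
    df = spark.read.format("delta").load(delta_path)
    # Overwrite so that each run leaves one consistent snapshot to import.
    df.write.mode("overwrite").parquet(parquet_path)
    return parquet_path
```

A scheduled job could call this once per table before the Power BI or AAS refresh runs, so the import only ever sees a completed export.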
If you want DirectQuery models, Synapse SQL Pool or Synapse Serverless would be the path, as these expose the data as SQL Server endpoints, for which Power BI and AAS support DirectQuery.
How to define measures, calculated tables, manage relationships efficiently without the hassle of doing this from scratch in every single .pbix?
Define them in an AAS tabular model or a Power BI shared dataset.
Just looking for a pointer as to the best way to go about this.
I'm comfortable with Power BI Report Builder (SSRS experience), but am pretty much a Power BI novice.
Basically, we have to create a Paginated (non-interactive) report for client consumption. It's going to be large, have multiple datasets, and use parameters / presence of data in the data sets to group data and/or turn sections on or off.
Not too much visualisation - some illustrative graphs and tables here and there - and quite a bit of text, some of it with data / text inserted via placeholders from the various datasets.
There are 3 Azure SQL databases I need to combine data from for this (split roughly into config, data and results).
In SSRS / SQL Server, I would have used one of my databases as the data source, and written a stored procedure per SSRS data set, joining to tables in other databases in the stored procedure query.
Then, in Report Builder, I would have just set up the data sets pointing at the stored procs and gone from there.
On Azure SQL Server, I think I've got 2 options:
1) Write elastic queries so I can bring in the data I need from each database, but query only one database.
2) Build a Power BI model / dataset that joins the relevant tables from the 3 databases together, publish it to the Power BI service and use that as my data source.
What's the best solution for my reporting scenario?
Cheers
In my scenario, Databricks performs read and write transformations on Delta tables. PBI is connected to the Databricks cluster, which therefore needs to be running most of the time, and that is expensive.
Knowing that delta tables are in a container, what would be the best way in terms of cost x performance to feed PBI from delta tables?
If your dataset is under the maximum size allowed in Power BI (100 GB, I guess) and a daily refresh is enough, you can just load everything into your Power BI model.
https://blog.gbrueckl.at/2021/01/reading-delta-lake-tables-natively-in-powerbi/
If you want to save costs, maybe you don't need transactions and can save the data as CSV in the data lake; then loading everything into Power BI and refreshing daily is really easy.
If you want to save costs and query newly arriving data all the time using DirectQuery, consider using Azure SQL. It has really competitive prices, starting from 5 EUR/USD. Integration with Databricks is also straightforward: write in append mode and it does all the magic.
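As a rough illustration of the append-mode write to Azure SQL, here is a minimal sketch using Spark's generic JDBC writer. It assumes a SQL Server JDBC driver is available on the cluster. The function name, the connection URL (something like jdbc:sqlserver://<server>.database.windows.net:1433;database=<db>), the table name and the credentials are all hypothetical placeholders.

```python
def append_to_azure_sql(df, jdbc_url, table, user, password):
    """Append the rows of a Spark DataFrame to an Azure SQL table over JDBC.

    All arguments except `df` are hypothetical placeholders for this sketch.
    """
    (
        df.write.format("jdbc")
        .option("url", jdbc_url)
        .option("dbtable", table)
        .option("user", user)
        .option("password", password)
        .mode("append")  # add new rows, keep what is already in the table
        .save()
    )
```

In practice the credentials would come from a secret store rather than being passed around in plain text.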
Another option to consider is to create an Azure Synapse workspace and use serverless SQL compute to query the delta lake files. This is a pay-per-TB-consumed pricing model, so you don’t have to have your Databricks cluster running all the time. It’s a great way to load Power BI import models.
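To give an idea of what such a serverless query looks like, the sketch below builds the T-SQL text in Python. It assumes the serverless pool reads the Delta folder through OPENROWSET with FORMAT = 'DELTA'. The helper name and the storage URL you pass in are hypothetical; the resulting string is what you would paste into the Power BI SQL Server connector or wrap in a view.

```python
def delta_openrowset_query(delta_folder_url, top=None):
    """Build a T-SQL query that reads a Delta folder via OPENROWSET.

    `delta_folder_url` is a placeholder for the root folder of the Delta
    table (the folder that contains the _delta_log directory).
    """
    if "'" in delta_folder_url:
        # Keep the sketch safe against broken or injected string literals.
        raise ValueError("URL must not contain single quotes")
    top_clause = f"TOP {int(top)} " if top is not None else ""
    return (
        f"SELECT {top_clause}* "
        f"FROM OPENROWSET(BULK '{delta_folder_url}', FORMAT = 'DELTA') AS [result]"
    )
```

Wrapping the generated SELECT in a view inside a serverless database would give Power BI a stable object name to import from.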
As far as I know, deploying a Power BI report from Power BI Desktop results in two items: the report itself and the dataset. Deploying a new report that uses the same dataset will deploy the new report and a second copy of the same dataset to the Power BI Service. That is not what I want. To avoid confusing end users and others, I want only one unique dataset deployed.
I want to use Azure DevOps to deploy to the Power BI Service across Dev, Test and Prod. The dataset will be an Azure Analysis Services data model, but the principle should be the same. I need exactly one dataset, and all reports must connect to that data model. I have heard that a REST API or PowerShell scripting can come to the rescue here.
So if any of you have done this or know of a good article that describes how to do this, I would be grateful.
Regards Geir
The best option is to separate the Power BI report into a frontend and a backend. If you are importing, you create a file purely for the dataset, with no reports built in it. You can then create the reports either in the service, connecting to the dataset, or in Power BI Desktop, using the 'Power BI dataset' connection option. Both use 'Live Connection' mode, so you cannot add any other data sources to the model, for example a CSV file or a SQL database.
If you are connecting to an Azure Analysis Services data model, you can use this approach too. However, since it is only a connection and not a full-fat dataset, having copies of the dataset should not be an issue. Copies are only an issue if you are importing data; in that case it is best to move things to dataflows, use the same frontend/backend method, and plan the scheduling of the dataflows and then the datasets.
You can use the REST API to move reports and the datasets that they connect to, and to move items to new workspaces. If you have Power BI Premium, it includes a lifecycle tool for moving items between dev/test/live workspaces.
If you create a report in Desktop and choose 'Power BI dataset' as a live connection, then when you upload the report to the same workspace it will upload only the report and connect it to the existing dataset.
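As one possible illustration of the REST API route, the sketch below prepares a call to the Power BI "Rebind Report In Group" operation, which points an existing report at a different dataset. That is one way to keep every report attached to a single shared dataset after a deployment. The function name, the IDs and the access token are hypothetical; acquiring the token (for example with a service principal in an Azure DevOps pipeline) is outside this sketch.

```python
import json
import urllib.request

API_ROOT = "https://api.powerbi.com/v1.0/myorg"


def build_rebind_request(group_id, report_id, dataset_id, access_token):
    """Prepare (but do not send) a request that rebinds a report to a dataset.

    All four arguments are placeholders: workspace (group) ID, report ID,
    target dataset ID, and an Azure AD bearer token.
    """
    url = f"{API_ROOT}/groups/{group_id}/reports/{report_id}/Rebind"
    body = json.dumps({"datasetId": dataset_id}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {access_token}",
            "Content-Type": "application/json",
        },
    )
```

Passing the returned object to urllib.request.urlopen would actually send the request. Building it separately keeps the sketch easy to inspect in a pipeline before anything is changed in the service.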
https://radacad.com/power-bi-shared-datasets-what-is-it-how-does-it-work-and-why-should-you-care#:~:text=A%20shared%20dataset%20is%20a%20dataset%20that%20shared%20between%20multiple,tenant%20in%20Power%20BI%20environment.
I have a collection of .pbix models that follow a similar structure, i.e., they have the same tables and relationships.
It is too complex to combine them all into a single .pbix.
Is there a way to upload all these tables into a single repository, like PBI Service dataflows, a data warehouse, or something similar,
and then get the data back into PBI Desktop to perform DAX calculations, visualizations and reporting?
Any suggestions/ ideas?
Thank you so much for helping!
You can publish them to Power BI Service, and then create separate reports, but using these published datasets as a data source.
See Connect to datasets in the Power BI service from Power BI Desktop.
After publishing your "model" reports to Power BI Online, start making a new blank report, but instead of getting the data from files/databases/etc., choose Power BI service as a data source and select the previously published dataset. After that, you can publish your report the same way, but in this case you can share one dataset (your model) between multiple reports.
I am connecting Power BI to Azure Data Lake Store, with multiple files representing multiple tables.
1) Update: It is currently loading the data into the Power BI file. But can I have a live connection from Power BI Desktop to Azure Data Lake Store?
2) Can I load multiple files to represent dimension and fact tables?
From Desktop you can access the Azure Data Lake Store data source; just make sure you're using a recent version of Power BI Desktop. See "Data Lake Store - Power BI".
You can join multiple queries together in Power BI Desktop.
The documentation for Power BI does not list Azure Data Lake Store as a source that can be connected live as of May 2018: https://learn.microsoft.com/en-us/power-bi/refresh-data#live-connections-and-directquery-to-on-premises-data-sources.
Alternatively, you could try using Azure Stream Analytics to create a job that can copy data, and connect to the live stream, but that process might need to be manually triggered and requires data movement, which might not be ideal for your scenario. https://learn.microsoft.com/en-us/azure/data-lake-store/data-lake-store-stream-analytics
Hope this helps.
Yes, you can get data from Azure Data Lake Store in the Power BI Desktop application.
Also, you can join multiple queries from different tables in the PBI Desktop app.