PowerBI reports run slowly in DirectQuery mode - powerbi

I have a powerbi report for finance. Users need to see the latest data in real-time, so I have to choose DirectQuery. But in this mode, some functions, such as DateAdd and DatesMtd, cannot be used
(This DAX function is not supported for use in DirectQuery mode.),
So I need to write a very complex SQL statement to achieve the equivalent effect, but this makes the report very slow (more than 10 seconds) every time it runs, and the largest table in my data model is less than 80000 rows. I've tried to optimize the SQL statements, but it doesn't help. Any solution?
(I use powerbi report server with sqlserver enterprise version)

Of course, with no information, I can't know what's taking so much time, but in order to understand what's happening you can use the following tools:
PowerBI Performance Analyzer: This will tell you what part is taking the most time. for more info see MSDocs & SQLBI
Check the datamodel and the storage mode of each table involved (ie: fact table, calendar, customer, etc). When querying the source, PBI won't use filters (directly in the query) that come from tables in import mode. (search for "composite models" on the web)
Limit the number of objects, for each object in the dashboard a query will be sent to the datasource, limiting the number of objects might help. (remember that objects wait for each other, so one slow loading object might cause your problem)
(even if you probably already did it) Have a look at the query execution plan, you can also check it for queries automatically created by PowerBI by capturing them (the easiest way is to use SQL Server Profiler)
I think that just by using PowerBI Performance Analyzer you will be able to see where the problem is, and then do more accurate search about it.

You need to search for those keywords;
Native query in power query: Some M language functions can directly be translated to SQL, so that all transformation happens in sql server side.
Aggregated tables in model view: aggregated views can be added for specific needs of visuals. Ex: if a visual has product category and amount as value you can connect aggregated sql table to the original one so that visual can pick up the value faster.
Hybrid tables: import mode and DQ mode can be used together. so you can use DQ for daily data and import mode for older data together.

Related

Analyze in Excel from Power BI not showing hierarchies

I am using Snowflake as my backend database and created & published a dataset in Power BI with direct query. As a next step I am trying to analyze the data (to get Pivot experience) in excel.
I am observing the hierarchies I have created are not showing up in excel, though those are showing when accessing through PBI Service.
DirectQuery comes with a slew of limitations compared to imported datasets. The only hierarchy-specific limitation included in the official documentation is that Auto date-time hierarchies are not created for DQ datasets. However, this documentation is about direct limitations and doesn't specifically cover limitations that might only be applied to XMLA connections, which your connection from Excel is.
A workaround is to just use computed columns with the hierarchy values and name them like Category01, Category02, Category03 and do the nesting yourself. Users often have use cases that involve using hierarchies out of order (like grouping by Category03 THEN by Category01) and so consider it a feature rather than a flaw.

What is difference between edit performed in query edit vs during modelling?

When I get data into Power BI I can edit the query as well as perform edit to the model.
What is difference between edit performed in query edit vs during modelling?
When you edit the query, you use Power Query, with its own Query Editor user interface. The steps you apply are recorded in the "M" language. Use Power Query to extract, transform, and finally load data into the Data Model.
Once the data is in the Data Model, you use DAX to create measures that you use in visuals. You can also use DAX to add more columns or even tables to the data model.
Whether to use Power Query or DAX to add columns or tables to the data model depends on a variety of factors. Some things are dead easy to do in Power Query, but harder to achieve with DAX, and vice versa. If you create a column with a formula that depends on a DAX measure, then you can only do that with DAX, because Power Query is not aware of the measures that are created after the load into the data model.
Power Query is very powerful, but the M code syntax is very different to the Excel formula syntax, or the VBA macro language. Learning to write advanced M code can be quite challenging.
DAX, on the other hand, behaves very similar to Excel formulas. Many Excel functions can even be used in DAX verbatim. If you know Excel, you've already got a head start on DAX and you can ease your way into it by learning additional functions and then expanding into more complex formulas.
The latter is probably the reason why many data manipulations are done in DAX, even though they could as well have been done in Power Query.
There are also some efficiencies with data storage and performance. Power Query makes use of query folding with SQL queries, for example, where its transformations are actually performed at the data source, i.e. on the SQL server side, and not in desktop client, and only the final query result is transferred to the desktop client.
Edit after comment: When the data is loaded into the data model, an algorithm processes the data and sorts it in a way that is most efficient for maximum compression and minimum storage. I don't have any concreate examples, but adding a column in Power Query will result in a smaller footprint than adding the same column with DAX. Read more about the compression algorithm VertiPaq here: https://towardsdatascience.com/inside-vertipaq-in-power-bi-compress-for-success-68b888d9d463
But apart from that, it mainly comes down to personal preference based on skill and experience.
By the way, many of your questions can be answered by reading through the Microsoft documentation, e.g. https://learn.microsoft.com/en-us/power-bi/guidance/import-modeling-data-reduction

PowerBI Query Performance

I have a PowerBI report that has a few different pages display different visuals. The report uses the same table of data (lets call it Jobs).
The previous author of this report has created two queries in the data section that read off this base table of data, but apply different transformations and filters to the underlying data. Then, the visuals use either of these models to display their data. For example, the first one applies a filter to exclude certain columns based off a status field and the other applies a different filter, and performs transformations on some of the columns
When I manually refresh the report, it looks like the report is retrieving data for both of these queries, even though the base data is the same. Since the dataset is quite large, I am worried that this report has been built inefficiently but I am not sure if there is a better way of doing this.
TL;DR; The Source and Navigation of both of queries is exactly the same - is this retrieving the data twice and causing my report to be inefficient, and if so, what is the approrpiate way to achieve what I am trying to do?
PowerBi will try to parallelize as much as possible. If you have two queries that read from the same table then two queries will be executed.
To avoid this you can:
create a query which only gets the necessary data from the table.
Set this table not to be loaded in the model (toggle "Enable Load")
Every other table that starts from this table won't be a clone of this but will reference it.
In this way, the data will be fetched once from the source and then used to create other tables using PowerQuery.

Most efficient Snowflake connection type from PowerBI?

We're trialling PowerBI on a Snowflake dimensional model and performance seems very non-optimised. Can anyone point me to information on best practices for this connection? I've previously used Tableau and there's an excellent white paper describing the pros/cons of each connection type and how to set this up so that as much heavy lifting as possible is done in Snowflake, with minimal load on the viz tool.
e.g. when you summarise 1 million invoices to get a chart of sales volume by year that distils this to 10 data points, Tableu would send 'SELECT year, sum(volume) FROM t GROUP BY year' (~10 rows), but in PowerBI we see SF receiving a query like 'SELECT invoice_id, sum(volume) FROM t GROUP BY invoice_id' (~1M rows) - leaving the viz tool to do a lot more work.
So far, we've tried mapping the individual facts and dimensions within PowerBI, and also using a mix of direct query and import, but without significant improvement. Is there any guidance on best practice?
Thanks in advance!
I've never used Snowflake, and I have no clue about how PowerBi interfaces to it. That said on the PowerBI side you may be interested in the composite model and aggregations.
MS Docs:
https://learn.microsoft.com/en-us/power-bi/desktop-composite-models
https://learn.microsoft.com/en-us/power-bi/desktop-storage-mode
https://learn.microsoft.com/en-us/power-bi/desktop-aggregations
Radacad's blog about aggregations:
https://radacad.com/power-bi-fast-and-furious-with-aggregations
https://radacad.com/dual-storage-mode-the-most-important-configuration-for-aggregations-step-2-power-bi-aggregations
In practice, when you are using a composite model the aggregation functionality allows you to create a hidden table (in import mode) in your model with aggregated data (by year, month, customer, etc).
Now when you query your data, PowerBI will check if this table can answer the query, if yes then it will just pick the data from this table, otherwise, it will run a query against the source (direct query)
The example you shared about PowerBI querying the source without asking for aggregation (but instead asking for every single InvoiceId) might be caused by not setting up the composite model correctly.
A table in "direct query" cannot reference other tables in its query (in this case the calendar) unless that table is also in "Direct query" or "dual" mode.
How does the model look like in the case you shared? and which is the storage mode of each table?

Visual has exceeded the available resources. Tips on how to streamline / better understand limitations?

OK, so I have a relatively complex report that works well on the desktop app but is bombing out on the web portal. Apparently, it is requesting 1048584KB which is just shy of the 1048576KB limit.
This report is a matrix, built as follows:
It is connected to two primary data sources, along with some tertiary feeds and helper tables. One of these is a sales detail table that is a CSV 887MB in size. The other is a purchasing detail table that is an XLS 26MB in size.
I have filtered out portions of the sales table (by date) in the Edit Queries screen. I have also filtered out specific item divisions in the matrix. It is the second step that was allowing this visual to function previously (took out a few not-needed divisions and it started working again, but now this no longer seems to work).
I would like to not just get a quick answer here, but also to better understand how Power BI is allocating the memory and how I can streamline. The rest of the report is using the same data but this is the only visual that fails to load (aside from some tables that are on a line-level and are intended to be filtered down via slicers prior to displaying information). Will add that there are some relatively complex measures that are firing on this visual and not used anywhere else, presuming this has a lot to do with memory demands...right?