Most efficient Snowflake connection type from PowerBI? - powerbi

We're trialling PowerBI on a Snowflake dimensional model and performance seems very non-optimised. Can anyone point me to information on best practices for this connection? I've previously used Tableau and there's an excellent white paper describing the pros/cons of each connection type and how to set this up so that as much heavy lifting as possible is done in Snowflake, with minimal load on the viz tool.
For example, when you summarise 1 million invoices to get a chart of sales volume by year that distils this to 10 data points, Tableau would send 'SELECT year, sum(volume) FROM t GROUP BY year' (~10 rows), but in PowerBI we see Snowflake receiving a query like 'SELECT invoice_id, sum(volume) FROM t GROUP BY invoice_id' (~1M rows), leaving the viz tool to do a lot more work.
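To make the difference concrete, here is a minimal sketch that uses Python's built-in sqlite3 module as a stand-in for Snowflake. The table t and its columns follow the two example queries; the data is made up and scaled down to 1,000 invoices over 10 years, so it only illustrates the row counts, not real Snowflake or PowerBI behaviour.

```python
import sqlite3

# In-memory SQLite database standing in for Snowflake (an assumption for illustration).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE t (invoice_id INTEGER PRIMARY KEY, year INTEGER, volume INTEGER)"
)

# 1,000 made-up invoices spread evenly over the years 2011..2020.
rows = [(i, 2011 + i % 10, 5) for i in range(1000)]
conn.executemany("INSERT INTO t VALUES (?, ?, ?)", rows)

# Aggregation pushed down to the database: one row per year.
by_year = conn.execute(
    "SELECT year, SUM(volume) FROM t GROUP BY year ORDER BY year"
).fetchall()

# Grouping by the invoice key: one row per invoice, so the client still
# has to aggregate everything itself.
by_invoice = conn.execute(
    "SELECT invoice_id, SUM(volume) FROM t GROUP BY invoice_id"
).fetchall()

print(len(by_year), "rows returned when grouping by year")
print(len(by_invoice), "rows returned when grouping by invoice_id")
```

The first query returns 10 rows and the second returns 1,000, which is the same ratio problem as the 10 versus 1M rows described here.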
So far, we've tried mapping the individual facts and dimensions within PowerBI, and also using a mix of direct query and import, but without significant improvement. Is there any guidance on best practice?
Thanks in advance!

I've never used Snowflake, and I have no clue how PowerBI interfaces with it. That said, on the PowerBI side you may be interested in composite models and aggregations.
MS Docs:
https://learn.microsoft.com/en-us/power-bi/desktop-composite-models
https://learn.microsoft.com/en-us/power-bi/desktop-storage-mode
https://learn.microsoft.com/en-us/power-bi/desktop-aggregations
Radacad's blog about aggregations:
https://radacad.com/power-bi-fast-and-furious-with-aggregations
https://radacad.com/dual-storage-mode-the-most-important-configuration-for-aggregations-step-2-power-bi-aggregations
In practice, when you use a composite model, the aggregation functionality lets you create a hidden table (in import mode) in your model that holds aggregated data (by year, month, customer, etc.).
Now when you query your data, PowerBI checks whether this table can answer the query. If it can, it reads the data from this table; otherwise, it runs a query against the source (direct query).
The example you shared about PowerBI querying the source without asking for aggregation (but instead asking for every single InvoiceId) might be caused by not setting up the composite model correctly.
A table in "direct query" cannot reference other tables in its query (in this case the calendar) unless that table is also in "Direct query" or "dual" mode.
What does the model look like in the case you shared, and what is the storage mode of each table?
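The hit-or-miss behaviour of an aggregation table can be sketched roughly as follows. This is a simplified illustration in Python, not how PowerBI is implemented: the rows, the column names and the answer function are all hypothetical, and the sketch only treats an exact match on the grouping columns as a hit.

```python
from collections import defaultdict

# Hypothetical detail rows: (invoice_id, year, customer, volume).
DETAIL = [
    (1, 2019, "A", 10),
    (2, 2019, "B", 20),
    (3, 2020, "A", 30),
    (4, 2020, "B", 40),
]
INDEX = {"invoice_id": 0, "year": 1, "customer": 2}

# Columns the pre-aggregated (import mode) table is grouped by.
AGG_COLUMNS = ("year",)


def group_sum(rows, columns):
    """Sum volume per combination of the requested grouping columns."""
    totals = defaultdict(int)
    for row in rows:
        key = tuple(row[INDEX[c]] for c in columns)
        totals[key] += row[3]
    return dict(totals)


# Built once and kept in memory, like the hidden import-mode table.
AGG_TABLE = group_sum(DETAIL, AGG_COLUMNS)


def answer(group_by):
    """Return (source, result), using the aggregate table when it covers the query."""
    if tuple(group_by) == AGG_COLUMNS:
        return "aggregation", AGG_TABLE
    # Aggregation miss: fall back to the detail data (the direct query source).
    return "detail", group_sum(DETAIL, group_by)


print(answer(["year"]))      # served from the aggregate table
print(answer(["customer"]))  # falls back to the detail rows
```

A query by year is answered from the small in-memory table, while a query by customer has to go back to the detail rows.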

Related

Filtering by Date and changing measure

I'm trying to build a report in a table that returns a count of completed checklists that are filtered by 2 Date Intelligence slicers. I have the targets for that month in a table, but I'm not sure how to change the measure by what's selected in the slicer.
I would like the measure to return the Target.Monthname of the selected slicers from the table Targets. The slicer is based on the DonesafeFolder Table with "Date of Completion"
Your question is a bit hard to understand.
But from your pictures, I think you should take a step back.
When using PowerBI, it is recommended to use a star schema, with fact and dimension tables. Usually you model that when you import your data (with Power Query).
When using a star schema, you also have a date dimension with the months on the rows instead of the columns, and PowerBI is better at handling data structured like that. To learn about designing the data model, you can read Microsoft's guide to star schemas.
Of course, if you know what you are doing, you can use your approach, but it makes things a lot harder.
By the way, you can't access the values in the slicers/filters directly. They filter the column you select, and that filter is then passed to the measures you create. If your data model is sound, they should indirectly filter your measure.
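As a rough illustration of "months on the rows instead of columns", the sketch below reshapes a wide targets table into a long one in plain Python. The table contents and the Site column are hypothetical; in PowerBI itself this reshaping would normally be done in Power Query (for example with its Unpivot Columns command) rather than in Python.

```python
# Hypothetical wide targets table: one column per month.
wide_targets = [
    {"Site": "North", "January": 10, "February": 12},
    {"Site": "South", "January": 8, "February": 9},
]


def unpivot(rows, id_column):
    """Turn every non-id column into its own (id, Month, Target) row."""
    long_rows = []
    for row in rows:
        for column, value in row.items():
            if column != id_column:
                long_rows.append(
                    {id_column: row[id_column], "Month": column, "Target": value}
                )
    return long_rows


long_targets = unpivot(wide_targets, "Site")

# With months on the rows, a slicer selection simply filters rows,
# and the measure sums whatever is left.
selected = [r["Target"] for r in long_targets if r["Month"] == "February"]
print(sum(selected))
```

Once the month is a value in a column, the measure no longer has to know which month column to pick: the filter does that work.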

What is difference between edit performed in query edit vs during modelling?

When I get data into Power BI I can edit the query as well as perform edit to the model.
What is difference between edit performed in query edit vs during modelling?
When you edit the query, you use Power Query, with its own Query Editor user interface. The steps you apply are recorded in the "M" language. Use Power Query to extract, transform, and finally load data into the Data Model.
Once the data is in the Data Model, you use DAX to create measures that you use in visuals. You can also use DAX to add more columns or even tables to the data model.
Whether to use Power Query or DAX to add columns or tables to the data model depends on a variety of factors. Some things are dead easy to do in Power Query, but harder to achieve with DAX, and vice versa. If you create a column with a formula that depends on a DAX measure, then you can only do that with DAX, because Power Query is not aware of the measures that are created after the load into the data model.
Power Query is very powerful, but the M code syntax is very different to the Excel formula syntax, or the VBA macro language. Learning to write advanced M code can be quite challenging.
DAX, on the other hand, behaves very similarly to Excel formulas. Many Excel functions can even be used in DAX verbatim. If you know Excel, you've already got a head start on DAX, and you can ease your way into it by learning additional functions and then expanding into more complex formulas.
The latter is probably the reason why many data manipulations are done in DAX, even though they could as well have been done in Power Query.
There are also some efficiencies with data storage and performance. Power Query makes use of query folding with SQL queries, for example, where its transformations are actually performed at the data source, i.e. on the SQL server side and not in the desktop client, and only the final query result is transferred to the desktop client.
Edit after comment: When the data is loaded into the data model, an algorithm processes the data and sorts it in a way that is most efficient for maximum compression and minimum storage. I don't have any concrete examples, but adding a column in Power Query will result in a smaller footprint than adding the same column with DAX. Read more about the compression algorithm VertiPaq here: https://towardsdatascience.com/inside-vertipaq-in-power-bi-compress-for-success-68b888d9d463
But apart from that, it mainly comes down to personal preference based on skill and experience.
By the way, many of your questions can be answered by reading through the Microsoft documentation, e.g. https://learn.microsoft.com/en-us/power-bi/guidance/import-modeling-data-reduction

PowerBI slicer value passing to query

I have a report generated from a SQL query that has a due date column. My requirement is to create a slicer so that, whatever date a user selects in it, the report shows all the data where the due date is earlier than the selected date.
I am not able to pass the slicer date to my SQL query.
Can you guide me in finding the best possible way?
This is not possible in general. Slicers and filters set on a report page cannot modify the model (e.g. calculated tables or calculated columns) and cannot modify the queries.
The only way to do this sort of thing is with DirectQuery, which does it automatically in the background, since it dynamically queries only the data it needs. Otherwise, you need to pre-load all of the data that you intend to use in the report.
Using DirectQuery has significant limitations and may or may not work for your use case. Please check the limitations and considerations in the linked documentation for details.
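The two options can be contrasted with a small sketch. It uses Python's built-in sqlite3 module as a stand-in for the SQL source, and the tasks table, its columns and the selected date are all made up for illustration; it shows the idea only, not what PowerBI actually sends.

```python
import sqlite3
from datetime import date

# In-memory SQLite database standing in for the SQL source (an assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tasks (task_id INTEGER, due_date TEXT)")
conn.executemany(
    "INSERT INTO tasks VALUES (?, ?)",
    [(1, "2023-01-10"), (2, "2023-02-15"), (3, "2023-03-20")],
)

selected = date(2023, 3, 1)  # the date picked in the slicer

# Import-style: load all rows once, then filter the loaded rows in memory.
loaded = conn.execute(
    "SELECT task_id, due_date FROM tasks ORDER BY task_id"
).fetchall()
import_result = [r for r in loaded if date.fromisoformat(r[1]) < selected]

# DirectQuery-style: the filter becomes part of the query sent to the source.
dq_result = conn.execute(
    "SELECT task_id, due_date FROM tasks WHERE due_date < ? ORDER BY task_id",
    (selected.isoformat(),),
).fetchall()

print(import_result == dq_result)
```

Both approaches return the same rows; the difference is where the filtering happens and how much data has to be loaded up front.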

PowerBI reports run slowly in DirectQuery mode

I have a PowerBI report for finance. Users need to see the latest data in real time, so I have to use DirectQuery. But in this mode, some functions, such as DateAdd and DatesMtd, cannot be used ("This DAX function is not supported for use in DirectQuery mode."), so I need to write a very complex SQL statement to achieve the equivalent effect. This makes the report very slow (more than 10 seconds) every time it runs, even though the largest table in my data model has fewer than 80000 rows. I've tried to optimize the SQL statements, but it doesn't help. Any solution?
(I use powerbi report server with sqlserver enterprise version)
Of course, with no information, I can't know what's taking so much time, but in order to understand what's happening you can use the following tools:
PowerBI Performance Analyzer: This will tell you which part is taking the most time. For more info, see MS Docs and SQLBI.
Check the data model and the storage mode of each table involved (e.g. fact table, calendar, customer, etc.). When querying the source, PowerBI won't push filters that come from tables in import mode directly into the query. (Search for "composite models" on the web.)
Limit the number of objects: for each object in the dashboard a query is sent to the data source, so limiting the number of objects might help. (Remember that objects wait for each other, so one slow-loading object might be causing your problem.)
(Even if you have probably already done it) Have a look at the query execution plan. You can also check it for the queries automatically created by PowerBI by capturing them (the easiest way is to use SQL Server Profiler).
I think that just by using PowerBI Performance Analyzer you will be able to see where the problem is, and then search for it more precisely.
You need to search for these keywords:
Native query in Power Query: Some M language functions can be translated directly to SQL, so that all the transformation happens on the SQL Server side.
Aggregated tables in model view: Aggregated views can be added for the specific needs of visuals. For example, if a visual has product category and amount as its value, you can connect an aggregated SQL table to the original one so that the visual can pick up the value faster.
Hybrid tables: Import mode and DQ mode can be used together, so you can use DQ for daily data and import mode for older data.

Power BI Aggregations - Detail Tables Must Be DirectQuery Tables?

I have a simple data model, from the Contoso database, that looks like this:
I'm trying to set up the table named Online Sales Aggregate as an aggregate table. When I attempt to set up a mapping, all the detail tables are disabled (see below)
When I hover over a table I see a message that says, "Customers (for example) must be a DirectQuery table to be used as the detail table."
All the tables in the model, including the Online Sales Aggregate table were imported. Why do the detail tables need to be DQ tables?
This is currently a limitation that Microsoft has imposed at least while aggregates are still in preview.
From Microsoft's documentation:
Detail table must be DirectQuery, not Import.
According to Microsoft people, it's likely that this limitation will eventually go away.
v-lili6-msft: Power bi product team is improving this preview feature
JoshCaplan-MSFT: This is still a work in progress but it is coming.
To expand on what David says below, I'd guess that removing this limitation is not a high priority since the main use case for aggregations is for datasets that are too unwieldy to import. If you've already imported all the data, then adding an aggregate table probably won't really speed things up that much in most cases.
If you still do need an aggregate table for an imported table, then you can do the workaround he describes by creating a summarized table via the query editor or a DAX calculated table and write your measure(s) to try to read from that first. An added bonus with this method is that you can use custom measures in your summarized table instead of being limited to aggregate summarization functions (Count, GroupBy, Max, Min, Sum), though you'll need to be careful with how you handle non-additive measures.
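The caution about non-additive measures can be illustrated with an average. The sketch below is plain Python with made-up sales rows, not DAX: the summarized table keeps the additive parts (total and row count), and the overall average is rebuilt from those parts instead of averaging the per-category averages.

```python
from collections import defaultdict

# Hypothetical detail rows: (category, amount).
sales = [("Bikes", 100), ("Bikes", 300), ("Bikes", 200), ("Helmets", 20)]

# Summarized table: keep the additive components per category.
summary = defaultdict(lambda: {"total": 0, "rows": 0})
for category, amount in sales:
    summary[category]["total"] += amount
    summary[category]["rows"] += 1

# Correct overall average: recombine the additive parts.
overall_avg = sum(s["total"] for s in summary.values()) / sum(
    s["rows"] for s in summary.values()
)

# Incorrect shortcut: averaging the per-category averages weights every
# category equally, regardless of how many rows it has.
avg_of_avgs = sum(s["total"] / s["rows"] for s in summary.values()) / len(summary)

# Reference value computed straight from the detail rows.
detail_avg = sum(amount for _, amount in sales) / len(sales)

print(overall_avg, avg_of_avgs, detail_avg)
```

Here the recombined average matches the detail rows (155.0), while the average of averages gives 110.0, so a summarized table for an average should store sums and counts rather than the average itself.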