Why does a table refresh when it is part of a merge, even when "Include in report refresh" is disabled on the source table (the source of the merge)? - powerbi

I have 2 tables in Power Query:
Sales
Rate
I have merged Rate into Sales and pulled the Rate.percentage column into the Sales table.
Then I unchecked the "Include in report refresh" option on the Rate table. My intention is that the Rate table should never refresh, and that the Sales table's merge with the Rate table should always use the values as of the first data load.
To test this I updated some rate percentage values in the Rate table in the database, and noticed that in the report visual the values that come directly from the Rate table still show the old values (which I expected). However, the Rate.percentage column that is part of the Sales table (due to the merge) shows the updated values. Is this expected, and how can I prevent it?
I also tried unchecking the "Enable load" option. With this the Rate table cannot be used in a visual (expected), but Rate.percentage in the Sales table visuals still shows the updated rates (when I change the values in the database and hit refresh).

This is just the way Power Query works: it doesn't store any data. If you have a table A that you are refreshing, and it merges in values from a table B that you have told not to refresh, Power Query still needs to re-stream the data from table B, despite your instruction, so that it can refresh table A.
The instruction not to refresh table B only works if the table is used in isolation and not referenced by other queries.
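You can see why from the M that the merge generates inside the Sales query: the merge step references the Rate query directly, so every evaluation of Sales re-evaluates Rate. A sketch of what the Sales query looks like after the merge (the join column RateKey and the step names are illustrative assumptions; only the table and column names come from the question):

```
let
    Source = Sales_Source,  // however Sales is actually loaded
    // This step references the Rate query by name, so Rate is re-evaluated
    // on every refresh of Sales, regardless of Rate's
    // "Include in report refresh" setting
    Merged = Table.NestedJoin(Source, {"RateKey"}, Rate, {"RateKey"}, "Rate", JoinKind.LeftOuter),
    Expanded = Table.ExpandTableColumn(Merged, "Rate", {"percentage"}, {"Rate.percentage"})
in
    Expanded
```

If you truly need Rate frozen at its first-load values, the usual workaround is to take Rate out of the refresh path entirely, for example by exporting it once to a static file and merging against that.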

Related

How to perform incremental refresh on merged table?

I have 2 tables in the Power Query Editor.
I want to merge them and implement incremental refresh on the merged table.
Following is my plan:
Merge both tables into a new table (Table3)
Disable refresh and disable load for both tables.
How to configure Incremental refresh on Table3?
Do I need to also configure Incremental refresh on Table1 and Table2?
So, technically, will each table be incrementally loaded and then merged, or will the entire data set be merged and then incrementally loaded?
For this to work you need to, in simple terms:
Create your limiting parameters RangeStart and RangeEnd
Set up a filter on the applicable date columns using the RangeStart and RangeEnd parameters in your subqueries Table1 and Table2 (this controls data ingestion)
Set up the same type of logic for the applicable date column in Table3 (this controls data deletion)
Configure incremental refresh time logic
For it to be actually efficient you also need to make sure:
Data is transactional in nature
Both subqueries are foldable and from the same data source
The resulting table is foldable
If the queries are not foldable, the refresh will require a full data load and a subsequent filter anyway, removing the benefits of incremental refresh.
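The RangeStart/RangeEnd filter in the steps above looks roughly like this in each subquery. This is a sketch only: the server, database, table, and OrderDate column names are assumptions, and the two parameters must be of type Date/Time for incremental refresh to accept them:

```
let
    Source = Sql.Database("server", "db"),  // illustrative source
    Orders = Source{[Schema = "dbo", Item = "Orders"]}[Data],
    // Filter on the date column using the RangeStart/RangeEnd parameters;
    // this predicate must fold to the source (become part of the SQL WHERE
    // clause) for incremental refresh to be efficient
    Filtered = Table.SelectRows(Orders, each [OrderDate] >= RangeStart and [OrderDate] < RangeEnd)
in
    Filtered
```

Note the asymmetric comparison (>= on RangeStart, < on RangeEnd), which avoids rows being loaded twice at partition boundaries.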
There exists a nice write-up of this in the Power BI Community pages that details how you would go about setting this up for a header/detail table join.

DirectQuery - Very inefficient queries being generated to Snowflake

I am connecting PBI to Snowflake using DirectQuery. To keep it simple, I have two tables, a product dimension table and a sales fact table. There are 3.7M rows in the product dimension table and 100M in the sales fact table. I also have a measure that calculates total sales which uses SUM to sum a column in the fact table.
I create a table visual in PBI and put the product description as the first column. The query generated by PBI is good. It retrieves 501 rows and displays them. So far, so good. Next I put the total sales measure as the second column. Now PBI generates several queries retrieving 1,000,001 rows. Of course I get an error stating the 1M row limit for DirectQuery has been reached.
This should not be happening. Has anyone run into something like this? Is there anything I can do?
I had a dig around, and there is a capability to adjust the limit if you have a Premium license:
https://powerbi.microsoft.com/en-gb/blog/five-new-power-bi-premium-capacity-settings-is-available-on-the-portal-preloaded-with-default-values-admin-can-review-and-override-the-defaults-with-their-preference-to-better-fence-their-capacity/

How to store a historical data of a calculated table during every refresh?

I have a calculated table that gets rebuilt during every refresh.
Weight =
SUMMARIZE(
    FILTER(Sprints, Sprints[Delivery Name] IN {"Retail Service", "ATLAS DG"}),
    Sprints[Delivery Name],
    Sprints[Delivery Team],
    Sprints[Component],
    Sprints[Sprint Name],
    "Planned", [Planned Weight],
    "Mid", [Mid Weight],
    "End", [End Weight],
    "Total Sprint Score", [Total Sprint Score New]
)
I refresh my source dataset every week. Right now the refresh just overwrites the values in the calculated table. I want to store last week's calculated table result set when I do this week's refresh, so that I can use it for historical analysis. Any ideas? Thanks.
I am afraid that is not possible within Power BI itself: a calculated table is always rebuilt from scratch on refresh. You should create a history (snapshot) table on the source database side instead.

Create a pivot table in Power BI with dates instead of aggregate values?

I have a table of companies with descriptive data about where we are in the sales stage with each company, and the date we entered that specific stage. As can be seen below, the stages are rows in a Process Step column.
My objective is to pivot this column so that each Process Step is a column, with a date below it, as shown in Excel:
I tried to edit the query and pivot the column upon loading it, but for the "aggregate value" column, no matter which column I use as the values column, it results in some form of error.
My advice would be not to pivot the table in the query, and to use measures to get the dates that you want instead. The benefit of not pivoting is that you remain able to perform all sorts of other analytics; for instance, a Sankey chart would be hard to build properly from a pivoted table.
To get the pivot table you are showing from Excel, it's as simple as using a matrix visual in Power BI and putting Client code in Rows, Process Step in Columns, and Effective date in Values.
If you need to perform calculations between stages, that's also not too difficult. For instance, you can create a measure that returns the date of one stage, another measure for another stage, and so on. For example:
Date uploaded = CALCULATE(MAX(Table[Effective Date]), FILTER(Table, Table[Process Step] = "Upload"))
Date exported = CALCULATE(MAX(Table[Effective Date]), FILTER(Table, Table[Process Step] = "Export"))
Time upload to export = DATEDIFF([Date uploaded], [Date exported], DAY)
These measures will work in the context of a client, assuming there is only one date per stage (and no Process Step in rows or columns). If another scenario is needed, perhaps a different approach could be taken.
Please let me know if that solves your problem.

Power BI: How to merge two tables (loaded plus created table)

I'm struggling with what I assume is a calculated table in Power BI Desktop.
I need to somehow connect my database loaded Accounts table with a manually created Progress table (with some fixed data), so that each row in Accounts basically has a calculated column which is the resulting Progress table for that row. (Hope that makes sense).
[This is the Progress table]
The calculated columns in the Progress table should use data from the related Accounts row to give an overview of where the Account is now, how long it took and the likely future time frames to reach the next levels of progress.
Is there a way to do such a thing?
TIA!
Dennis
I think you have 2 options:
Add a column in the model layer using the DAX LOOKUPVALUE function: https://msdn.microsoft.com/en-us/library/gg492170.aspx
From the Edit Queries window, use the Merge Queries button, then expand the resulting NewColumn.
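For option 1, a calculated column on Accounts might look like the following. This is a sketch only: the column names AccountId and Stage are assumptions, since the actual Progress table structure isn't shown in the question.

```
// Calculated column on the Accounts table: look up the matching Stage value
// from the Progress table where the AccountId columns are equal
Progress Stage =
LOOKUPVALUE(
    Progress[Stage],        // result column to return
    Progress[AccountId],    // search column in Progress
    Accounts[AccountId]     // value from the current Accounts row
)
```

Note that LOOKUPVALUE returns a single scalar per row, so option 2 (Merge Queries) is the better fit if each Account relates to multiple Progress rows.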