How to perform Incremental refresh when making a star schema using Power BI transforms?


I have a large table (FactSales) in the DW with the following columns:
OrderDate|ProductID|OrderNumber|CustomerName|Amount
DateTable connects on date column to OrderDate
DimProduct connects on ProductID column to ProductID
Incremental refresh is configured on this table.
Now, to improve this model in Power BI, I want to move CustomerName to its own new dimension table (say, DimCustomer). To achieve this, suppose I duplicate the fact table, keep only the CustomerName column, remove duplicates, and add an index column. Then I merge the FactSales table with DimCustomer.
At this point I'm unable to configure incremental refresh for the DimCustomer and FactSales tables because the View Native Query option is disabled (the query doesn't fold).
So my attempt to improve the model comes at the cost of not being able to refresh incrementally. How do Power BI engineers handle this scenario (adding an index column and merging with another table) so that incremental refresh can still be performed?

You just need to apply the RangeStart/RangeEnd parameter filter before the first non-foldable query step, so that the date filter itself can still fold to the source.
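A minimal Power Query M sketch of that ordering, under some assumptions: the server and database names are placeholders, OrderDate is a datetime column, RangeStart and RangeEnd are the Date/Time parameters required by incremental refresh, and DimCustomer is a separate query with a hypothetical CustomerKey index column. The point is only that the date filter comes first and the merge comes after it.

```powerquery
let
    // Placeholder connection details (assumptions, not from the question)
    Source = Sql.Database("myserver", "MyDW"),
    FactSales = Source{[Schema = "dbo", Item = "FactSales"]}[Data],

    // Step 1: filter on RangeStart/RangeEnd while the query can still fold.
    // Use >= on one side and < on the other so a row never lands in two partitions.
    Filtered = Table.SelectRows(
        FactSales,
        each [OrderDate] >= RangeStart and [OrderDate] < RangeEnd
    ),

    // Step 2: steps that may not fold (merge with the derived dimension) come afterwards.
    // DimCustomer and CustomerKey are hypothetical names for the new dimension and its index.
    Merged = Table.NestedJoin(
        Filtered, {"CustomerName"},
        DimCustomer, {"CustomerName"},
        "DimCustomer", JoinKind.LeftOuter
    ),
    Expanded = Table.ExpandTableColumn(Merged, "DimCustomer", {"CustomerKey"}),
    Result = Table.RemoveColumns(Expanded, {"CustomerName"})
in
    Result
```

Right-clicking the Filtered step and checking whether View Native Query is available is a quick way to confirm the filter reaches the source.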

Related

Are there any challenges with Incremental refresh with independent table versus merged table?

I am practicing incremental refresh with independent and merged tables scenarios.
Say I have 2 tables - Sales (SalesID, ProductID, Date, Amount) and Rate (ProductID, Date, pc%)
End goal is that the Sales table must have 5th column = Sales.Amount * Rate.pc%
Incremental refresh needs to be configured because there is a large volume of data.
Scenario 1: Independent tables
I set up incremental refresh for both tables. In Power Query I added an extra column (newid) to both, which is a concatenation of ProductID and Date. In the Power BI model I connected the Rate table to the Sales table on the newid column. To the Sales table I added a calculated column = Sales[Amount] * RELATED(Rate[pc%])
I'm monitoring queries hitting the SQL server and notice that queries are sent for both tables to update their partitions. This is as expected.
Scenario 2: Merge tables in Power Query
Scenario 2.Trial 1:
I set up incremental refresh for the Sales table. In Power Query I merged the Rate table on ProductID and Date to retrieve the corresponding pc% value.
Observing the SQL queries, there is one query for each partition of the Sales table, and the whole Rate table is pulled again for each Sales partition.
Scenario 2.Trial 2:
In addition to the above, I set up incremental refresh on the Rate table and observed the same behavior (that is, for each partition of the Sales table, the entire Rate table is fetched). In addition, separate queries are fired for the Rate table's own partitions.
Scenario 2.Trial 3:
In addition to both of the above, I disabled incremental refresh on the Rate table but kept the RangeStart/RangeEnd parameter filters on it, and disabled load of the Rate table. Now each incremental partition query for the Sales table also applies the same date filters to the Rate table.
Are there any challenges to consider with either of these approaches (Scenario 1 or Scenario 2.Trial 3) for incremental refresh, when the eventual goal is to merge the tables to retrieve a column value from the merged table? The negative aspect of Scenario 2 is that after the request for each Sales table partition, it fires a request for the corresponding Rate table rows for that date range.

Filter multiple columns using single slicer in power bi

I am new to Power BI. I am creating a basic tabular report, but the catch is that I have multiple date columns in the dataset (e.g. productvalidfrom, productvalidto, ordervalidfrom, ordervalidto), and I want to filter these columns with a single date selection.
If I select 2021-09-01, then the filter condition will be
2021-09-01>=productvalidfrom and 2021-09-01<productvalidto and 2021-09-01>=ordervalidfrom and 2021-09-01<ordervalidto
and I need to show all the columns from the dataset, with no summarization.
Thanks in advance.

I have been able to implement this before. You will need to add a date table to your data sources. Then you will want to create relationships between the date columns in your data and the date table. Only one of them will be an active relationship; the rest will be inactive. Then your slicer will use the date from your newly created date table.
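If there is no date table in the source, one can be generated in Power Query. The following is a minimal sketch only; the start and end dates are arbitrary placeholders, and the relationships to the other date columns still have to be created in the model afterwards.

```powerquery
let
    // Placeholder date range (assumption); widen it to cover all dates in the data
    StartDate = #date(2021, 1, 1),
    EndDate = #date(2021, 12, 31),

    // Number of days in the range, inclusive of both ends
    DayCount = Duration.Days(EndDate - StartDate) + 1,

    // One list item per calendar day
    Dates = List.Dates(StartDate, DayCount, #duration(1, 0, 0, 0)),

    // Turn the list into a single-column table typed as date
    DateTable = Table.FromList(Dates, Splitter.SplitByNothing(), {"Date"}),
    Typed = Table.TransformColumnTypes(DateTable, {{"Date", type date}})
in
    Typed
```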
This link should be able to guide you through the build.

Can Power BI create an SCD from a table with no historical data?

Can a temporal table be created in Power BI instead of SQL? I want to import data from my organization's employee database (which overwrites changes, so there is no historical data) and compare it in Power BI to the table I loaded a month ago. If a record is different, can Power BI add a new record to show the SCD with the new employee title and date-stamp it for that day?

Typically no, unfortunately (although you could possibly fake it using dataset partitions, dataflows, or programmatically adding records to what's called a "push dataset", and none of these would be easy or stable). Power BI assumes that all data will be purged on a full refresh.

How to pull the latest records from a SQL table into a Power BI dashboard?

I am a beginner with Power BI.
I have loaded a SQL table into a Power BI dataset. The table keeps updating with new records frequently. How can I always pull the latest five records into the dashboard? There is an ID number that keeps growing with the number of rows.
Another question: how can I show the dashboard for the current working day only? There is a date stamp in the SQL data. How can I use it? With a filter I am able to select a particular date, but not to set it to the current day, last week, etc.
Thanks in advance.

If your data doesn't contain any field that tells you which rows are the latest, then there's not much you can do here. Apparently you do have some sort of timestamp, so a query something like this could work as the data source for your report:
SELECT TOP 5 field1, field2... FROM Table ORDER BY Timestamp DESC
As for your second question, you can add a relative date filter: https://learn.microsoft.com/en-us/power-bi/visuals/desktop-slicer-filter-date-range
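The same "latest five rows" idea can also be written as Power Query steps instead of hand-written SQL. This is a sketch under assumptions: the server, database, and table names are placeholders, and the column is assumed to be called Timestamp as in the SQL above. Against a SQL source, a sort followed by a first-N step will usually fold into a TOP ... ORDER BY query, but that is worth confirming with View Native Query.

```powerquery
let
    // Placeholder connection details (assumptions)
    Source = Sql.Database("myserver", "MyDB"),
    Data = Source{[Schema = "dbo", Item = "Table"]}[Data],

    // Newest rows first, based on the timestamp column
    Sorted = Table.Sort(Data, {{"Timestamp", Order.Descending}}),

    // Keep only the five most recent rows
    Latest = Table.FirstN(Sorted, 5)
in
    Latest
```

Sorting on the growing ID column instead of the timestamp would work the same way if the ID is strictly increasing.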

Problems loading data in to Analysis Services Model

I’m building a model in Azure Analysis Services. The model should contain only data for the last 3 months and is processed every day.
I have a separate date dimension that is related to a fact table through a datekey. I’m using a Power Query query to load only the last 3 months into the date dimension. In the query that loads the fact table, I used Table.NestedJoin to load only the rows that have a matching value in the date table.
When I do this, processing the model takes forever. After some troubleshooting I saw that the query Analysis Services uses to retrieve data from the SQL database retrieves all rows. So, am I correct in saying that AS loads all the data before it merges the rows? Is there a way to change this? Or is there a better way to achieve my solution?
Kind regards,

Joins are super slow in Power Query. You should avoid them if you can do the join in the data source or use normal relationships in the data model instead.
Also, you can set up the date dimension in DAX and dynamically populate it to contain only dates present in the fact table.
As for the load of all the data, it could be because the data is fetched as is, and only then does Power Query apply the transformations (the join).

You can modify the query in the Power Query Editor / Advanced Editor to add a WHERE clause directly in the query.
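An alternative to hand-writing the WHERE clause is to filter the fact table on a constant computed in M, which a SQL source can normally fold into a WHERE clause. The sketch below makes several assumptions: the server, database, and table names are placeholders, and the fact table is assumed to have an integer DateKey column in yyyymmdd form (the question only says "datekey").

```powerquery
let
    // Placeholder connection details (assumptions)
    Source = Sql.Database("myserver", "MyDW"),
    Fact = Source{[Schema = "dbo", Item = "FactSales"]}[Data],

    // Date three months before today
    Cutoff = Date.AddMonths(Date.From(DateTime.LocalNow()), -3),

    // Convert the cutoff to an integer key such as 20240131 (assumes a yyyymmdd DateKey)
    CutoffKey = Date.Year(Cutoff) * 10000 + Date.Month(Cutoff) * 100 + Date.Day(Cutoff),

    // Comparing a column against a constant is a filter the source can evaluate itself
    Recent = Table.SelectRows(Fact, each [DateKey] >= CutoffKey)
in
    Recent
```

Because the filter no longer depends on a join with the date table, the source should return only the recent rows rather than the full fact table.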
