Incremental refresh not working as expected when dimension is created from a fact table using duplicate table/remove duplicate rows/index col/merge - powerbi

I have a large table (Sales) in the DW (data warehouse) with the following columns:
CREATE TABLE [dbo].[sales](
[id] [int] NULL,
[amt] [int] NULL,
[date] [datetime] NULL,
[customername] [nvarchar](50) NULL
) ON [PRIMARY]
GO
INSERT [dbo].[sales] ([id], [amt], [date], [customername]) VALUES (1, 1000, CAST(N'2022-04-01T00:00:00.000' AS DateTime), N'Customer A')
GO
INSERT [dbo].[sales] ([id], [amt], [date], [customername]) VALUES (2, 2000, CAST(N'2022-04-10T00:00:00.000' AS DateTime), N'Customer B')
GO
INSERT [dbo].[sales] ([id], [amt], [date], [customername]) VALUES (3, 3000, CAST(N'2022-04-15T00:00:00.000' AS DateTime), N'Customer A')
GO
Incremental refresh is configured on this table.
To improve this model, I want to move CustomerName into its own new dimension table (say, Customer). To achieve this, suppose I duplicate the fact table, keep only the CustomerName column, remove duplicates, and add an index column. I then merge the SalesFact table with DimCustomer.
Incremental refresh configured for the sales table is shown below:
Configure the same incremental refresh for the customer table.
Place 2 table visuals on the report, one for the sales table and one for the customer table, to see the output.
Now I deploy this report to the Power BI portal.
Add the following 2 records to the source table (SQL Server):
4, 4000, yesterday's date, N'Customer A'
5, 5000, yesterday's date, N'Customer C'
Now refresh the report in the Power BI portal for the first time. The refresh breaks (it doesn't complete). However, the underlying queries suggest that Power BI applies query folding on both tables.
Now remove incremental refresh from the customer table, publish the report, and refresh it. This time the refresh works.
But the fact table gets the wrong ID (2) for the Customer C record, and Customer C doesn't show up in the customer table.
So can we not implement incremental refresh when making dimensions from a table using Power BI transforms?
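One possible reading of the wrong ID, offered as an assumption rather than a confirmed diagnosis: if the "remove duplicates, add index column" steps run only over the rows of the partition being refreshed, the index restarts inside that partition. The Python sketch below mimics that behaviour with a hypothetical helper `build_customer_dim`; the 1-based index and the partition contents are assumptions made for illustration.

```python
# Hypothetical sketch: mimics "keep CustomerName, remove duplicates, add index column".
# The 1-based index and the partition contents are assumptions for illustration.

def build_customer_dim(customer_names):
    """Assign a 1-based index to each distinct name, in order of first appearance."""
    dim = {}
    for name in customer_names:
        if name not in dim:
            dim[name] = len(dim) + 1
    return dim

# All five rows from the question, in id order.
all_rows = ["Customer A", "Customer B", "Customer A", "Customer A", "Customer C"]
# Only the two rows dated yesterday, i.e. what a refreshed partition might see.
partition_rows = ["Customer A", "Customer C"]

full_dim = build_customer_dim(all_rows)
partition_dim = build_customer_dim(partition_rows)

print(full_dim)       # index computed over the whole table
print(partition_dim)  # index computed over the refreshed partition only
```

In this sketch Customer C receives 3 when the index is computed over the full table but 2 when it is computed over the partition alone, which matches the symptom described. A key that comes from the source (for example, a customer ID maintained in the warehouse) would not depend on which rows happen to be loaded.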

Related

power bi load rows based on condition in Query editor

I have the below data
create table #tbl (Continet varchar(40), Country varchar(40), City varchar(40), OrderDate date)
insert into #tbl values
('South America', 'Colombia', 'Angostura', '2020-05-13'),
('Europe','Germany','Thuringia','2019-02-12'),
('Asia','China','Tianjin','2017-11-22'),
('Asia','India ','Hyderabad','2018-02-15'),
('Asia','China','Tianjin','2016-10-30'),
('Europe','United Kingdom', 'Northwich','2015-05-03')
select * from #tbl
drop table #tbl
I am loading this data into Power BI, and there are millions of rows.
I want to filter the data in the Power BI query editor so that only the India rows are loaded. The data actually comes from Google Analytics; I used SQL above only for simplicity. How can I do this in the Power Query editor?
Thanks
Assuming you know what a SQL view is:
create a SQL view with a suitable WHERE clause and feed Power BI with its result.
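A minimal sketch of the view-based approach, using Python's built-in `sqlite3` module as a stand-in for the real database. The view name `india_orders` and the helper `load_india_rows` are hypothetical, and `TRIM` is used only because the sample data stores 'India ' with a trailing space.

```python
import sqlite3

def load_india_rows():
    """Build the sample table in memory, define a filtered view, and read it back."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE tbl (Continet TEXT, Country TEXT, City TEXT, OrderDate TEXT)"
    )
    conn.executemany(
        "INSERT INTO tbl VALUES (?, ?, ?, ?)",
        [
            ("South America", "Colombia", "Angostura", "2020-05-13"),
            ("Europe", "Germany", "Thuringia", "2019-02-12"),
            ("Asia", "China", "Tianjin", "2017-11-22"),
            ("Asia", "India ", "Hyderabad", "2018-02-15"),
            ("Asia", "China", "Tianjin", "2016-10-30"),
            ("Europe", "United Kingdom", "Northwich", "2015-05-03"),
        ],
    )
    # The view does the filtering, so the client only ever sees India rows.
    conn.execute(
        "CREATE VIEW india_orders AS "
        "SELECT Continet, TRIM(Country) AS Country, City, OrderDate "
        "FROM tbl WHERE TRIM(Country) = 'India'"
    )
    rows = conn.execute("SELECT * FROM india_orders").fetchall()
    conn.close()
    return rows

print(load_india_rows())
```

Filtering in the view means the millions of non-matching rows are discarded at the source rather than after they have been transferred.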

Power BI - Select Slicer Date Between 2 Columns

Hopefully a quick explanation of what I am hoping to accomplish followed by the approach we've been working on for over a year.
Desired Result
I have a table of SCD values with two columns, SCD_Valid_From and SCD_Valid_To. Is there a way to join a date table in my model (or simply use a slicer without a join) in order to be able to choose a specific date that is in between the two SCD columns and have that row of data returned?
Original Table
ID | SCD_Valid_From | SCD_Valid_To | Cost
1 | 2020-08-01 | 2020-08-03 | 5.00
Slicer date chosen is 2020-08-02. I would like this ID=1 record to be returned.
What We've Attempted So Far
We had a consultant come in and help us get Power BI launched last year. His solution was to create an expansion table that would contain a row for every ID/Date combination.
Expanded Original Table
ID | SCD_Valid_Date | Cost
1 | 2020-08-01 | 5.00
1 | 2020-08-02 | 5.00
1 | 2020-08-03 | 5.00
This was happening originally on the Power BI side, and we would use incremental refresh to control how much of this table was getting pushed each day. Long story short, this was extremely inefficient and made the refresh too slow to be effective - for 5 years' worth of data, we would need over 2000 rows per ID just to be able to select a dimensional record.
Is there a way to use a slicer where Power BI can select the records where that selected date falls between dates in two columns of a table?
Let me explain a workaround that I hope will help solve your issue. I assume you have the two tables below:
"Dates" table with a "Date" column, from which you generate the date slicer.
"your_main_table" with columns "scd_valid_from" and "scd_valid_to".
Step 1: For this workaround, there must be no relationship between "Dates" and "your_main_table". If there is none, that is fine.
If a relationship already exists between those tables, create a new disconnected table such as "Dates2" with the code below:
Dates2 =
SELECTCOLUMNS(
Dates,
"Date", Dates[Date]
)
From here on, I will treat "Dates2" as the source of your date slicer. If your "Dates" table has no relationship with "your_main_table", just use "Dates" in place of "Dates2" in the measures below. Now create the following 4 measures in "your_main_table":
1.
date_from_current_row = MAX(your_main_table[SCD_Valid_From])
2.
date_to_current_row = MAX(your_main_table[SCD_Valid_To])
3.
date_selected_in_slicer = SELECTEDVALUE(Dates2[Date])
4.
show_hide_row =
IF(
    [date_selected_in_slicer] >= [date_from_current_row]
        && [date_selected_in_slicer] <= [date_to_current_row],
    1,
    0
)
Now everything is in place. Create your visual using columns from "your_main_table".
Final step: add a visual-level filter on the measure "show_hide_row" and set it to show items only when the value is 1.
The final output will look something like the image below:
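The logic behind the show/hide flag can be sketched outside DAX. The Python below is an illustration only: the helper `rows_valid_on` and the second sample row are hypothetical, and the comparison is inclusive at both ends, mirroring the `>=` and `<=` used in the measure.

```python
from datetime import date

def rows_valid_on(rows, selected):
    """Keep rows whose validity range contains the selected date (inclusive)."""
    return [
        row for row in rows
        if row["SCD_Valid_From"] <= selected <= row["SCD_Valid_To"]
    ]

rows = [
    # The row from the question.
    {"ID": 1, "SCD_Valid_From": date(2020, 8, 1),
     "SCD_Valid_To": date(2020, 8, 3), "Cost": 5.00},
    # Hypothetical second row, added only to show a non-matching range.
    {"ID": 2, "SCD_Valid_From": date(2020, 8, 4),
     "SCD_Valid_To": date(2020, 8, 10), "Cost": 6.00},
]

selected_ids = [row["ID"] for row in rows_valid_on(rows, date(2020, 8, 2))]
print(selected_ids)
```

Because the range test is evaluated per row at query time, no expanded row-per-day table is needed.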

How to limit calendar table on a page level in Power BI

I have a calendar table that goes beyond today's date, which is what I need for the report.
But on one page of the report I need the calendar table limited to dates on or before today.
I tried to filter the calendar table, but it seems it cannot be filtered dynamically as of TODAY().
Is there any way to limit the calendar table up to a particular date?
Create a calculated column to flag the desired dates, using DAX along these lines:
IsBeforeToday =
IF ( 'Calendar'[Date] <= TODAY (), 1, 0 )
Then use this column as a visual-level (or page-level) filter in your report, setting the value to 1. Note that a calculated column is evaluated when the data is refreshed, so the flag is as of the last refresh.
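The flag itself is a plain comparison against a reference date. A small Python sketch, with a hypothetical helper `flag_on_or_before` and a fixed reference date standing in for "today":

```python
from datetime import date, timedelta

def flag_on_or_before(calendar_dates, today):
    """Return (date, flag) pairs where flag is 1 for dates on or before today."""
    return [(d, 1 if d <= today else 0) for d in calendar_dates]

# Hypothetical five-day calendar centred on an assumed reference date.
reference = date(2024, 1, 15)
calendar_dates = [reference + timedelta(days=offset) for offset in range(-2, 3)]

flags = flag_on_or_before(calendar_dates, reference)
for day, flag in flags:
    print(day.isoformat(), flag)
```

Filtering on the flag equal to 1 keeps the reference date and everything before it, and drops future dates.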

Previous period calculation in Azure Data Explorer\Kusto

We are building a report in Power BI on data sitting in Azure Data Explorer. Because we need the report to be fully dynamic, we cannot pre-write the queries but must rely on Power BI to generate the queries for Data Explorer according to the user actions on the report.
One of our requirements is to show several measures compared to their value on previous period (month). The measure must also be very dynamic and so the correct value must be based on the user filters and actions and can't be pre-calculated.
We added a calculated measure in Power BI:
Prev_Month_Amt=CALCULATE(SUM(sales[Amt]),DATEADD(dates[Record_DT],-1,MONTH))
The dates table contains one row per day and is linked to the sales table in Power BI using Many-to-one relationship. The sales table includes several hundred millions records.
The problem is that when we add the Prev_Month_Amt measure to a Power BI visual such as a Matrix, we encounter very long run times and quite often "ge Accumulated string array getting too large" errors.
Is there a better way to build previous period calculations in Power BI based on Azure Data Explorer?
Thanks,
H.G.
You can add the previous month amount as a column of the Kusto table presented to PBI (either by adding it to a real table using an update policy or Microsoft Flow, or by extending it in a stored function). PBI will then see it as a regular column. Here is an example:
let T = datatable(Amount:double, Day:datetime, LineItem:string, Account:string)
[
    2, datetime(2019-01-03), "revenue", "a",
    2, datetime(2019-01-05), "revenue", "a",
    5, datetime(2019-01-03), "revenue", "b",
    5, datetime(2019-01-05), "revenue", "b",
    10, datetime(2019-02-07), "revenue", "a",
    2, datetime(2019-02-10), "revenue", "a",
    3, datetime(2019-02-10), "revenue", "b",
    4, datetime(2019-02-10), "revenue", "b"
];
T
| extend Month = startofmonth(Day)
| summarize Amount = sum(Amount) by Month, LineItem, Account
| join kind=leftouter
    (
    T
    | extend Month = startofmonth(endofmonth(Day) + 1d) // sets the current month to the next month
    | summarize LastMonthAmount = sum(Amount) by Month, LineItem, Account
    ) on Month, LineItem, Account
| project Month, LineItem, Account, Amount, LastMonthAmount
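The same shift-and-join idea can be traced in Python to see what the query is expected to produce for the sample data. This is an illustrative sketch, not a replacement for the Kusto query; the helper names `month_start`, `next_month_start` and `with_last_month_amount` are hypothetical.

```python
from collections import defaultdict
from datetime import date

def month_start(d):
    """First day of the month containing d."""
    return d.replace(day=1)

def next_month_start(d):
    """First day of the month after the one containing d."""
    if d.month == 12:
        return date(d.year + 1, 1, 1)
    return date(d.year, d.month + 1, 1)

def with_last_month_amount(rows):
    """Map (month, line_item, account) to (amount, last_month_amount or None)."""
    current = defaultdict(float)
    shifted = defaultdict(float)
    for amount, day, line_item, account in rows:
        current[(month_start(day), line_item, account)] += amount
        # Attribute the same amount to the following month, as the join's right side does.
        shifted[(next_month_start(day), line_item, account)] += amount
    # Left outer join: keys without a shifted match get None.
    return {key: (amount, shifted.get(key)) for key, amount in current.items()}

rows = [
    (2, date(2019, 1, 3), "revenue", "a"),
    (2, date(2019, 1, 5), "revenue", "a"),
    (5, date(2019, 1, 3), "revenue", "b"),
    (5, date(2019, 1, 5), "revenue", "b"),
    (10, date(2019, 2, 7), "revenue", "a"),
    (2, date(2019, 2, 10), "revenue", "a"),
    (3, date(2019, 2, 10), "revenue", "b"),
    (4, date(2019, 2, 10), "revenue", "b"),
]

result = with_last_month_amount(rows)
for key in sorted(result):
    print(key, result[key])
```

Since the previous-month value is materialised per month, line item and account, Power BI only has to read a column instead of evaluating a time-intelligence measure against the full table.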

How to add rows to a new custom table based on values in other table?

I need to add duplicate entries in a new custom table from the inputs given in another table.
e.g. consider a table with columns
Subscriber
Effective_date
Expiriation_date
Service_Code
Now I want to dynamically create a custom table which will have duplicate entries with the same
Subscriber
Effective_date
Expiriation_date
Service_Code
I also need a new column for the dates between "Effective_date" and "Expiriation_date", so I can filter and see which months a customer was subscribed for before leaving.
Right now I have this:
but instead of 3, which is the number of months until the customer left, I want to see the months in between:
May
June
July
How can I do this?
If I understand what you are asking for, this should give you what you want.
First, you will need a table full of dates. If you don't already have one, you can have Power BI create one for you by clicking 'Modeling' -> 'New Table' and then entering this formula (you can alter the dates to expand or shrink the range included in the table).
Dates = CALENDAR(DATE(2016, 1, 1), DATE(2017, 12, 31))
Then create a new table with this formula. This will give you a new table with rows for each date between the 'Effective_Date' and 'Expiration_Date' from the original table.
ServicePeriodExpanded = FILTER(CROSSJOIN(ServicePeriod, 'Dates'), ServicePeriod[Effective_Date] <= 'Dates'[Date] && IF(ServicePeriod[Expiration_Date] = BLANK(), TODAY(), ServicePeriod[Expiration_Date]) >= 'Dates'[Date])
Here are the results.
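The DAX above expands each service period to one row per day. If only the month names are wanted, as in the May/June/July example from the question, the expansion can be sketched at month granularity. The Python below is an illustration with a hypothetical helper `months_between` and assumed sample dates.

```python
from datetime import date

MONTH_NAMES = [
    "January", "February", "March", "April", "May", "June",
    "July", "August", "September", "October", "November", "December",
]

def months_between(effective, expiration):
    """List the month names from the effective month through the expiration month."""
    months = []
    year, month = effective.year, effective.month
    while (year, month) <= (expiration.year, expiration.month):
        months.append(MONTH_NAMES[month - 1])
        month += 1
        if month > 12:
            year, month = year + 1, 1
    return months

# Assumed sample dates: a subscription running from May to July 2016.
subscribed = months_between(date(2016, 5, 10), date(2016, 7, 3))
print(subscribed)
```

Each returned name corresponds to one row in an expanded table, so counting the rows per subscriber gives back the original month count.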