powerbi directquery distinct count group by issue - powerbi

I have a table format with
record date(including seconds), user ID, Database
I need to show the maximum distinct number of users per hour grouped by every 5 minutes ( not sure if this explains - please see example below)
I am only using DirectQuery storage and not intended to change that to import as well.
I have tried various methods but could not manage without changing the storage mode. Any help is appreciated.
My table sample is,
21/01/2019 12:35:00, jane, UK
21/01/2019 12:35:00, joe, UK
21/01/2019 12:35:00, joe, NL
21/01/2019 12:40:00, bob, NL
21/01/2019 12:40:00, jane, NL
21/01/2019 12:40:00, joe, NL
21/01/2019 12:40:00, jakob, NL
Expected result
21/01/2019 12, UK, 2
21/01/2019 12, NL, 4

Start by adding an hour column in the data table.
Hour = HOUR([DateTime])
Then create a temporary summarized table (this table can be hidden from the report view).
SummTemp =
SUMMARIZE(
Data;
[Datetime];
[Nat];
[Hour]; // this is the calculated column with only hour.
"Date"; DATE(YEAR([DateTime]); MONTH([DateTime]); DAY([DateTime]));
"Count"; COUNTA('Data'[Nat])
)
Then create a second summarized table:
SummaryTable =
SELECTCOLUMNS(
SUMMARIZE(
SummTemp;
[Date];
[Hour];
[Nat];
"MaxCount"; MAX('SummTemp'[Count]);
"NewName"; [Date]&" "&[Hour]
// OR "NewName"; ( [Date]&" "&TIME([Hour]; 0; 0))*
);
"DateHour"; [NewName];
"Nationality"; [Nat];
"MaxCNX"; [MaxCount]
)
The end result will look like this:
*If you use this line, then you will get a column in text format but with [dd/mm/yyyy hh:00:00]. You can convert this column to a datetieme format by selecting it and change it from text to datetime via Modeling->Data type. NMot sure if you can drop both minutes and seconds from it but you can drop seconds so that your DateHour column will have a format like: [dd/mm/yyyy hh:00]

Related

Active users on a given date in a Month in Power BI

I am working to get cumulative distinct count of uids on daily basis. My dataset consists dates and UserIDs active on that date. Example : Say there are 2 uids (235,2354) appeared on date 2022-01-01 and they also appeared on next day with new uid 125 (235,2354,125) on 2022-01-02 At this point i want store cumulative count to be 3 not 5 as (user id 235 and 2354 already appeared on past day ).
My Sample Data looks like as follows:
https://github.com/manish-tripathi/Datasets/blob/main/Sample%20Data.xlsx
enter image description here
and my output should look as follows:
enter image description here
Here's one way that seems to work, using your linked Excel sheet as the data source.
Create a new table:
Table 2 = DISTINCT('Table'[Date])
Add the columns:
MAU = CALCULATE(
DISTINCTCOUNT('Table'[User ID]),
'Table'[Date] <= EARLIER('Table 2'[Date]))
DAU = CALCULATE(DISTINCTCOUNT('Table'[User ID]),
'Table'[Date] = EARLIER('Table 2'[Date]))
Result from your Excel data

Power BI - Show zero/null value on line chart with dates

I'm using a simple table like this to make a report in Power BI:
Order number
Order date
Turnover
001
30/1/2022
10 €
002
30/1/2022
20 €
003
2/2/2022
15 €
I need to create a line chart showing all the dates, even where I have no data (no orders for that day). This is currently how is shown:
You can notice that the 1/2/22 and 3/2/22 are missing due to no order, but I want them to be visible and the value should be 0. This is also affecting the average line because it's calculated based on the days with data, but I need to put into account also the 0-turnover days.
I tried to use the "Show items with no data" on the date dimension and switch the X axis from Continuous to Catergorical and the other way around. I also tried to create a new metric like this:
Total Turnover = IF(ISBLANK(SUM(Orders[Turnover (EUR)])), 1, SUM(Orders[Turnover (EUR)]))
but it's not working.
If I understand your business requirement correctly, you are going to need to do three things:
Make sure you have a date-dimension table in your model. Build the relationship based on your [Order date] column.
Refactor your [Total Turnover] measure as such:
Total Turnover =
VAR TotalTurnover = SUM( Orders[Turnover] )
RETURN
IF(
ISBLANK( TotalTurnover ),
0,
TotalTurnover
)
Build your line chart using the [Date] column from your date table.

Power BI - Select Slicer Date Between 2 Columns

Hopefully a quick explanation of what I am hoping to accomplish followed by the approach we've been working on for over a year.
Desired Result
I have a table of SCD values with two columns, SCD_Valid_From and SCD_Valid_To. Is there a way to join a date table in my model (or simply use a slicer without a join) in order to be able to choose a specific date that is in between the two SCD columns and have that row of data returned?
Original Table
ID | SCD_Valid_From | SCR_Valid_To | Cost
1 2020-08-01 2020-08-03 5.00
Slicer date chosen is 2020-08-02. I would like this ID=1 record to be returned.
What We've Attempted So Far
We had a consultant come in and help us get Power BI launched last year. His solution was to create an expansion table that would contain a row for every ID/Date combination.
Expanded Original Table
ID | SCD_Valid_Date | Cost
1 2020-08-01 5.00
1 2020-08-02 5.00
1 2020-08-03 5.00
This was happening originally on the Power BI side, and we would use incremental refresh to control how much of this table was getting pushed each day. Long story short, this was extremely inefficient and made the refresh too slow to be effective - for 5 years' worth of data, we would need over 2000 rows per ID just to be able to select a dimensional record.
Is there a way to use a slicer where Power BI can select the records where that selected date falls between dates in two columns of a table?
Let me explain a workaround and I hope this will help you to solve your issue. Let me guess you have below 2 tables-
"Dates" table with column "Date" from where you are generating the date slicer.
"your_main_table" with with column "scd_valid_from" and "scd_valid_to".
Step-1: If you do not have relation between table "Dates" and "your_main_table", this is fine as other wise you have to create a new table like "Dates2". For this work around, you can not have relation between those tables.
In case you have already relation established between those tables, create a new custom table with this below code-
Dates2 =
SELECTCOLUMNS(
Dates,
"Date", Dates[Date]
)
From here, I will consider "Dates2" as source of your Date slicer. But if you have "Date" table with no relation with table "your_main_table", just consider "Dates" in place of "Dates2" in below measures creation. Now, Create these following 4 measures in your table "your_main_table"
1.
date_from_current_row = max(join_using_date_range[SCD_Valid_From])
2.
date_to_current_row = max(join_using_date_range[SCD_Valid_to])
3.
date_selected_in_slicer = SELECTEDVALUE(Dates2[Date])
4.
show_hide_row =
if(
[date_selected_in_slicer] >= [date_from_current_row]
&& [date_selected_in_slicer] <= [date_to_current_row]
,
1,
0
)
Now you have all instruments ready for play. Create your visual using columns from the table "your_main_table"
Final Step: Now just add a visual level filter with the measure "show_hide_row" and set value will show only when "show_hide_row = 1".
The final output will be something like below image-

Pivot with dynamic DATE columns

I have a query that I created from a table.
example:
select
pkey,
trunc (createdformat) business_date,
regexp_substr (statistics, 'business_ \ w *') business_statistics
from business_data
where statistics like '% business_%'
group by regexp_substr(statistics, 'business_\w*'), trunc(createdformat)
This works great thanks to your help.
Now I want to show that in a crosstab / pivot.
That means in the first column are the "business_statistics", the column headings are the "dynamic days from business_date".
I've tried the following, but it doesn't quite work yet
SELECT *
FROM (
select
pkey,
trunc(createdformat) business_date,
regexp_substr(statistics, 'business_\w*') business_statistics
from business_data
where statistics like '%business_%'
)
PIVOT(
count(pkey)
FOR business_date
IN ('17.06.2020','18.06.2020')
)
ORDER BY business_statistics
If I specify the date, like here 17.06.2020 and 18.06.2020 it works. 3 columns (Business_Statistic, 17.06.2020, 18.06.2020). But from column 2 it should be dynamic. That means he should show me the days (date) that are also included in the query / table. So that is the result of X columns (Business_Statistics, Date1, Date2, Date3, Date4, ....). Dynamic based on the table data.
For example, this does not work:
...
IN (SELECT DISTINCT trunc(createdformat) FROM BUSINESS_DATA WHERE statistics like '%business_%' order by trunc(createdformat))
...
The pivot clause doesn't work with dynamic values.
But there are some workarounds discuss here: How to Convert Rows to Columns and Back Again with SQL (Aka PIVOT and UNPIVOT)
You may find one workaround that suits your requirements.
Unfortunately, I am not very familiar with PL / SQL. But could I still process the start date and the end date of the user for the query?
For example, the user enters the APEX environment as StartDate: June 17, 2020 and as EndDate: June 20, 2020.
Then the daily difference is calculated in the PL / SQL query, then a variable is filled with the value of the entered period using Loop.
Example: (Just an idea, I'm not that fit in PL / SQL yet)
DECLARE
startdate := :P9999_StartDate 'Example 17.06.2020
enddate := P9999_EndDate 'Example 20.06.2020
BEGIN
LOOP 'From the startdate to the enddate day
businessdate := businessdate .... 'Example: 17.06.2020,18.06.2020,19.06.2020, ...
END LOOP
SELECT *
FROM (
select
pkey,
trunc(createdformat) business_date,
regexp_substr(statistics, 'business_\w*') business_statistics
from business_data
where statistics like '%business_%'
)
PIVOT(
count(pkey)
FOR business_date
IN (businessdate)
)
ORDER BY business_statistics
END;
That would be my idea, but I fail to implement it. Is that possible? I hope you understand what I mean

Dax Finding date based on Criteria calculated Column

I couldn't find an answer for my issue elsewhere, so trying my luck here.
I have sales table and my final result should determine if there were sales made for same person in specific period of time, for example within 7 business days. for example:
For ID 123 I have to flag it that sale for products A,B,C where within specified period.
For ID 1234 only sales of products A and B meet the criteria product C was sold in irrelevant time frame.
I've created a date table with indicators that determine for each date if the date is a working day, but i am struggling to calculate the relevant last working day
For example: I need that for 01/01/2019 i will get 10/01/2019 date, based on NUMOFDAYS and FinalWorkday = TRUE, which basically means that i have to count NUMOFDAYS times TRUE statement for each date and return corresponding date.
After that step done I think that it would be pretty easy for me to determine if the sale for a person was made in specific time frame.
If someone can give me direction for that much appreciated
Thank you in advance
You could use a DateTable like this one:
I used the following DAX-expressions for the calculated columns:
nrDays = 7
isWorkDay = WEEKDAY('DateTable'[Date],2) < 6
rankWorkingDays = RANKX ( FILTER ( DateTable, DateTable[isWorkDay] = TRUE () ),
DateTable[Date] , , ASC )
LastWorkDay = LOOKUPVALUE ( DateTable[Date],
DateTable[isWorkDay], TRUE (),
DateTable[rankWorkingDays], DateTable[rankWorkingDays] + DateTable[nrDays])
This issue can be solved by the following, since three are non-working days/holidays we can filter them out via POWERQUERY, than add an Index Column and Another column Which is basically Index column + Number of days wanted, then simply merge duplicate of dates query on Index+number of days wanted column on Index column. And we get the correct date