I have two tables:
A calendar table with both dates and hours.
And a table that contains incidents, with start time and end time in addition to set of attributes (only one, incident_code, in the example table).
For a specific date range, I want to show how many incidents occurred each hour.
The following measure works, but struggles when the incident table becomes large and there is several slicers
incidentCnt =
CALCULATE(
COUNTROWS(incidents);
FILTER(
incidents;
incidents[start_datetime] < MAX(date_hour[datetime]) &&
incidents[end_datetime] > MAX(date_hour[datetime])
)
)
What would be a more efficient way to calculate this with DAX in Power BI?
EDIT: This expression will work with the data presented above (minor edits to the accepted answer).
Expanded_Incidents =
GENERATE(
incidents;
SELECTCOLUMNS(
GENERATESERIES(
DATE(
YEAR(incidents[start_datetime]);
MONTH(incidents[start_datetime]);
DAY(incidents[start_datetime])
) + TIME(HOUR(incidents[start_datetime]);0;0);
incidents[end_datetime];
TIME(1;0;0)
);
"Expanded_incident_time"; [Value]
)
)
A more effectiv way would be to expand the incidents table so that each incident occurs for each time between the start and end datetime. This you can do with a calculated table like this:
Expanded_Incidents =
GENERATE(
incidents;
SELECTCOLUMNS(
GENERATESERIES(
incidents[start_date];
incidents[end_date];
TIME(h;m;s)
);
"Expanded_incident_time"; [Value]
)
)
Change the TIME(h;m;s) to the time resolution of your choosing.
Then make a relationship between your dateTime dim table and the [Expanded_incident_time] column (this should be a 1:*). Then you only have to make a measure which counts the number of rows in the new table. This will give you the number of active incidents during a specific time and I believe it will be faster than reitterating over the table each time.
Related
My intention is to populate days of the month to simulate a data warehouse periodic snapshot table using DAX measures. My goal is to show non-additive values for the quantity.
Consider the following transactions:
The granularity of my snapshot table is day. So it should show the following:
Take note that a day may have multiple entries but I am only interested in the latest entry for the day. If I am looking at the figures using a week period it should show the latest entry for the week. It all depends on the context fixter.
However after applying the measure I end up with:
There are three transactions. Two on day 2 and the other on day 4. Instead of calculating a running total I want to show the latest Qty for the days which have no transactions without running accumulating totals. So, day 4 should show 4 instead of summing up day 3 and day 4 which gives me 10. I've been experimenting with LASTNONBLANK without much success.
This is the measure I'm using:
Snapshot =
CALCULATE(
SUM('Inventory'[Quantity]),
FILTER(
ALL ( 'Date'[Date] ),
'Date'[Date] <= MAX( 'Date'[Date] )
)
)
There are two tables involved:
Table # 1: Inventory table containing the transactions. It includes the product id, the date/time the transaction was recorded and the quantity.
Table # 2: A date table 'Date' which has been marked as a date table in Power BI. There is a relationship between the Inventory and the Date table based on a date key. So, in the measure, 'Date'[Date] refers to the Date column in the Date table.
You can use the LASTNONBLANKVALUE function, that returns the last value of the expression specified as second parameter sorted by the column specified as first parameter.
Since LASTNONBLANKVALUE implicitly wraps the second parameter into a CALCULATE, a context transition happens and therefore the row context is transformed into the corresponding filter context. So we also need to use VALUES to apply the filter context to the T[Qty] column. The returned table is a single row column and DAX can automatically convert a single column, single row table to a scalar value.
Then, since we don't have a dimension table we have to get rid of cross-filtering, therefore we must use REMOVEFILTERS over the whole table.
the filter expression T[Day] < MaxDay is needed because LASTNONBLANKVALUE must be called in a filter context containing all the rows preceding and including the current one.
So, assuming that the table name is T with fields Day and Qty like in your sample data, this code should work
Edit: changed in order to support multiple rows with same day, assuming the desired result is the sum of the last day quantities
Measure =
VAR MaxDay =
MAX ( T[Day] )
RETURN
CALCULATE (
LASTNONBLANKVALUE (
T[Day],
SUM ( T[Qty] )
),
T[Day] <= MaxDay,
REMOVEFILTERS ( T )
) + 0
Edit: after reading the comments, this might work on your model (untested)
Measure =
VAR MaxDay =
MAX ( 'Date'[Date] )
RETURN
CALCULATE (
LASTNONBLANKVALUE (
Inventory[RecordedDate],
SUM ( Inventory[Quantity] )
),
'Date'[Date] <= MaxDay
) + 0
I want to create a table based on input table.
Input table is:
The new table filters the input table to show the last entry of every day.
I have tried working with measure but sometimes cant tell if it is working right until I graph it in pivot tables which is not so bad but sometimes just doesn't show me what I need to see exactly.
I have tried this measure:
History_Daily Efficiency =
VAR LastDailyEfficiency =
GENERATE(
VALUES ('Table_Full'[Cell]),
CALCULATETABLE (
TOPN (
1,
GROUPBY (
'Table_Full',
'Table_Full'[Date],
'Table_Full'[Time],
'Table_Full'[Efficiency]
),
'Table_Full'[Date], DESC,
'Table_Full'[Time], DESC,
'Table_Full'[Efficiency], ASC
)
)
)
RETURN
CALCULATE (
AVERAGE('Table_Full'[Efficiency]),
TREATAS( LastDailyEfficiency, 'Table_Full'[Cell], 'Table_Full'[Date], 'Table_Full'[Time], 'Table_Full'[Efficiency]),
'Table_Full'[Efficiency] < 80
)
But I got this:
I would like to see this as the output:
You can create a new table:
LastDayCount = GROUPBY(Table_Full;Table_Full[lob/Part Number];Table_Full[Date];"LastDate";MAXX(CURRENTGROUP(); Table_Full[DateTime]))
This will create a table with the last DateTime of the day.
Next we add a column giving us the max of that particular last datetime of the day. I noticed that you have more the same entries, the logic below takes the max part count at the end of the day when more than one entry.
Count =
CALCULATE(MAX(Table_Full[Part Count]);
FILTER(Table_Full;LastDayCount[Table_Full_lob/Part Number] = Table_Full[lob/Part Number]
&& LastDayCount[LastDate] = Table_Full[DateTime]))
End result:
I have an identified date within a fact table and I need to create a new table that's formatted "MMM-YY" (JAN-01) and sorted in month order.
I entered the data manually from a Excel Spreadsheet with dates ranging from Jan-15 to Dec-30 and sorted by an id num but ideally I want it to update monthly based on a identified date column within the fact table in BI. So for now id want dates upto Sep-19 which would update when the identified dates are refreshed within the fact table.
For now I've used the below to format the column which was inserted manually however id like to change the monthref.date to be based off of an identified date within my fact table.
ID Date = FORMAT('Month Ref'[Date],"MMM-YY")
Id want my date within the fact table to be lookedup/refrenced in another table formatted "mmm-yy" and to be in order of calendar month.
I can use
CALENDAR(MIN(FactEfficiencies[identified_date].[Date]) AND MAX(FactEfficiencies[identified_date].[Date])) to get every date then format but it will bring back the month multiple times when I just need one.
You are already on the good direction with the calendar table. One thing that could help you achieve your result could be including a SUMMARIZE or GROUPBY function. Have a look at this example.
Calendar =
var _fullCalendar =
ADDCOLUMNS (
CALENDAR ( MIN ( 'Table'[Date] ) ; MAX ( 'Table'[Date] ) ) ;
"MonthYear" ; FORMAT ( [Date] ; "MMM-YY" )
)
RETURN
SUMMARIZE ( _fullCalendar ; [MonthYear] )
I tried create a dashboard based on fiscal year, with more Filters, like region, sales rep name, ...
Example files avaliable on dropbox:
https://www.dropbox.com/sh/l25kdz6enmg35yb/AABPuOk3kKOpfQdKDfRUcnX2a?dl=0
On my closest attempt, i tried this follow:
Add one column on my data set, naming each period as distinct number, like: "17";"18";"19", due to deslocated fiscal year (april to march).
Then create a measure:
PREVIOUS CROP_YEAR = SWITCH(TRUE();
SELECTEDVALUE('dataset'[Crop-X])=16;(CALCULATE(SUM('dataset'[Order Value]);ALL('dataset')));
SELECTEDVALUE('dataset'[Crop-X])=17;(CALCULATE(SUM('dataset'[Order Value]);ALL('dataset')));
SELECTEDVALUE('dataset'[Crop-X])=18;(CALCULATE(SUM('dataset'[Order Value]);ALL('dataset')));
SELECTEDVALUE('dataset'[Crop-X])=19;(CALCULATE(SUM('dataset'[Order Value]);ALL('dataset')));
0)
Expected output was:
Values based on all filters applied, But instead i just get an empty charts
The measure is return the total because you are explicitly asking for it by using the ALL function. This removes all the filters from the dataset thus returning a grand total. This can work but it creates a complexity in your dataset with respect of having two time dimensions. The way to solve this is to first make sure you filter the date correctly with respect to both dimensions
PREVIOUS YEAR =
CALCULATE(
SUM('dataset'[Order Value]);
FILTER(
ALL ( 'dataset' ) ;
AND (
'dataset'[Crop-X] = MAX('dataset'[Crop-X]) -1 ;
'dataset'[YEAR] = MAX('dataset'[YEAR] ) -1
)
)
)
Furthermore, this measure still uses the ALL function which means any other filters get ignored. Using ALLSELECTED instead would result in the relative time filtering to result in nothing as soon as you select any time based slicer in your dashboard, this prevents the filter from looking at any other part of the dataset that is not within the primary sliced dataset. The workaround would be to use ALLEXCEPT and add the filters you want to be able to use as arguments. Downside is that any filter you add to your dashboard will have to be added to the exception manually.
PREVIOUS YEAR =
CALCULATE(
SUM('dataset'[Order Value]);
FILTER(
ALLEXCEPT( 'dataset' ; Dim1[Group] ; Dim1[Manager] ; Dim1[Region] ) ;
AND (
'dataset'[Crop-X] = MAX('dataset'[Crop-X]) -1 ;
'dataset'[YEAR] = MAX('dataset'[YEAR] ) -1
)
)
)
I have following scenario which has been simplified a little:
Costs fact table:
date, project_key, costs €
Project dimension:
project_key, name, starting date, ending date
Date dimension:
date, years, months, weeks, etc
I would need to create a measure which would tell project duration of days using starting and ending dates from project dimension. The first challenge is that there isn't transactions for all days in the fact table. Project starting date might be 1st of January but first cost transaction is on fact table like 15th on January. So we still need to calculate the days between starting and ending date if on filter context.
So the second challenge is the filter context. User might want to view only February. So it project starting date is 1.6.2016 and ending date is 1.11.2016 and user wants to view only September it should display only 30 days.
The third challenge is to view days for multiple projects. So if user selects only single day it should view count for all of the projects in progress.
I'm thankful for any help which could lead towards the solution. So don't hesitate to ask more details if needed.
edit: Here is a picture to explain this better:
Update 7.2.2017
Still trying to create a single measure for this solution. Measure which user could use with only dates, projects or as it is. Separate calculated column for ongoing project counts per day would be easy solution but it would only filter by date table.
Update 9.2.2017
Thank you all for your efforts. As an end result I'm confident that calculations not based on fact table are quite tricky. For this specific case I ended up doing new table with CROSS JOIN on dates and project ids to fulfill all requirements. One option also was to add starting and ending dates as own lines to fact table with zero costs. The real solution also have more dimensions we need to take into consideration.
To get the expected result you have to create a calculated column and a measure, the calculated column lets count the number of projects in dates where projects were executed and the measure to count the number of days elapsed from [starting_date] and [ending_date] in each project taking in account filters.
The calculated column have to be created in the dim_date table using this expression:
Count of Projects =
SUMX (
FILTER (
project_dim,
[starting_date] <= EARLIER ( date_dim[date] )
&& [ending_date] >= EARLIER ( date_dim[date] )
),
1
)
The measure should be created in the project_dim table using this expression:
Duration (Days) =
DATEDIFF (
MAX ( MIN ( [starting_date] ), MIN ( date_dim[date] ) ),
MIN ( MAX ( [ending_date] ), MAX ( date_dim[date] ) ),
DAY
)
+ 1
The result you will get is something like this:
And this if you filter the week using an slicer or a filter on dim_date table
Update
Support for SSAS 2014 - DATEDIFF() is available in SSAS 2016.
First of all, it is important you realize you are measuring two different things but you want only one measure visible to your users. In the first Expected result you want to get the number of projects running in each date while in the Expected results 2 and 3 (in the OP) you want the days elapsed in each project taking in account filters on date_dim.
You can create a measure to wrap both measures in one and use HASONEFILTER to determine the context where each measure should run. Before continue with the wrapping measure check the below measure that replaces the measure posted above using DATEDIFF function which doesn't work in your environment.
After creating the previous calculated column that is required to determine the number of projects in each date, create a measure called Duration Measure, this measure won't be used by your users but lets us calculate the final measure.
Duration Measure = SUMX(FILTER (
date_dim,
date_dim[date] >= MIN ( project_dim[starting_date] )
&& date_dim[date] <= MAX ( project_dim[ending_date] )
),1
)
Now the final measure which your users should interact can be written like this:
Duration (Days) =
IF (
HASONEFILTER ( date_dim[date] ),
SUM ( date_dim[Count of Projects] ),
[Duration Measure]
)
This measure will determine the context and will return the right measure for the given context. So you can add the same measure for both tables and it will return the desired result.
Despite this solution is demonstrated in Power BI it works in Power Pivot too.
First I would create 2 relationships:
project_dim[project_key] => costs_fact[project_key]
date_dim[date] => costs_fact[date]
The Costs measure would be just: SUM ( costs_fact[costs] )
The Duration (days) measure needs a CALCULATE to change the filter context on the Date dimension. This is effectively calculating a relationship between project_dim and date_dim on the fly, based on the selected rows from both tables.
Duration (days) =
CALCULATE (
COUNTROWS ( date_dim ),
FILTER (
date_dim,
date_dim[date] >= MIN ( project_dim[starting_date] )
&& date_dim[date] <= MAX ( project_dim[ending_date] )
)
)
I suggest you to separate the measure Duration (days) into different calculated column/measure as they don't actually have the same meaning under different contexts.
First of all, create a one-to-many relationship between dates/costs and projects/costs. (Note the single cross filter direction or the filter context will be wrongly applied during calculation)
For the Expected result 1, I've created a calculated column in the date dimension called Project (days). It counts how many projects are in progress for a given day.
Project (days) =
COUNTROWS(
FILTER(
projects,
dates[date] >= projects[starting_date] &&
dates[date] <= projects[ending_date]
)
)
P.S. If you want to have aggregated results on weekly/monthly basis, you can further create a measure and aggregate Project (days).
For Expected result 2 and 3, the measure Duration (days) is as follows:
Duration (days) =
COUNTROWS(
FILTER(
dates,
dates[date] >= FIRSTDATE(projects[starting_date]) &&
dates[date] <= FIRSTDATE(projects[ending_date])
)
)
The result will be as expected: