DAX query for counting values based on another cumulative DAX measure - powerbi

I have this table:
Id Length(m) Defect Site Date
1 10 1 y 10/1/19
2 60 0 x 09/1/19
3 30 1 y 08/1/19
4 80 1 x 07/1/19
5 20 1 x 06/1/19
I want to count the number of defects and IDs that fall within the last 100 m of length (sorted by date DESC), while keeping the ability for this to change with additional filters. For example: how many defects are there for site x in the last 100 m, or how many defects in the last 100 m have an ID bigger than 1?
For the question 'How many defects are there for site x in the last 100 m?', I would like the result to be 2, as the table should look like this:
Id Length(m) Length Cum. Defect Site Date
4 80 80 1 x 07/1/19
5 20 100 1 x 06/1/19
I believe the issue in creating this query so far has been that I need to create a cumulative DAX query first and then base the counting query off of that DAX query.
Also important to note that the filtering will be undertaken in PowerBI. I don't want to hardcode filters in the DAX query.
Any help is welcome.

Alright!
I have taken a crack at this. I did assume that the id of the items(?) increments through time, so the oldest item has the lowest id.
You were correct that we need to filter the table based on the cumulative sum of the meters. So I first add a virtual column to the table (CumulativeMeters), which I can then use to filter the table. I need to break the filter context of the ADDCOLUMNS function to sum up the meters of multiple rows.
Important is to use ALLSELECTED to keep any external filters in place. After this it is pretty straightforward to filter the table on a maximum CumulativeMeters of <= 100 meters and where the row is a defect. Counting the rows in the resulting table gives you the result you are looking for:
# Defects last 100m =
CALCULATE (
    COUNTROWS ( Items ),
    FILTER (
        ADDCOLUMNS (
            Items,
            "CumulativeMeters",
                CALCULATE (
                    SUM ( Items[Length(m)] ),
                    FILTER (
                        ALLSELECTED ( Items ),
                        Items[Date] <= EARLIER ( Items[Date] )
                            && Items[Id] <= EARLIER ( Items[Id] )
                    )
                )
        ),
        [CumulativeMeters] <= 100
            && Items[Defect] = 1
    )
)
Hope that helps,
Jan

Related

Counting a conditional flag within a group over a subgroup in Power BI

I am looking for help in calculating a total by month, that counts how many applications met an aggregated criteria within that month.
A simplified version of the data is that we receive applications, and complete a number of tasks to process that application. Each task has a duration, and adding all the tasks for an application gives the total processing time for the application. We then want to know how many applications were processed in 10 days or less.
Note - the tasks to sum can vary, so we want to calculate the application total duration at the DAX layer, rather than in PowerQuery.
Example raw data:
Application  Task  Period  Duration
A            2859  Aug-22  2
A            2860  Aug-22  2
A            2861  Aug-22  1
B            1990  Aug-22  1
B            1991  Aug-22  20
C            9940  Sep-22  0
C            9941  Sep-22  27
D            1891  Sep-22  1
D            1892  Sep-22  8
E            3697  Sep-22  -
E            3698  Sep-22  26
The calculation condition would then look like this (we don't need to create this view, I'm just including it to explain the logic):
The actual output we're looking to create is a total by month (based on when application was received):
I've got a version of this working by creating a summary table in PowerQuery that makes a unique list of Application and Period combinations, and then links back to the main table to sum the duration for each, but I feel like it should be possible to do this using a DAX formula - as much for my own learning as anything else. Any suggestions on a) how to do it directly in DAX and b) whether this is sensible would be greatly appreciated!
You'll have to adjust the table and column names, but you can accomplish this logic with the following measures. The variable applicationsLessThan10 is a virtual table by Month and application and total duration for each. This is then filtered accordingly and we return the number of rows.
I split out the month name into a new column with Power Query. Ideally, you would have a date table which would account for year and month and the sorting of the month name.
LessThan10Days =
VAR applicationsLessThan10 =
    FILTER (
        ADDCOLUMNS (
            SUMMARIZE (
                ApplicationProcessing,
                ApplicationProcessing[Month Name],
                ApplicationProcessing[Application]
            ),
            "Days", CALCULATE ( SUM ( ApplicationProcessing[Duration] ) )
        ),
        [Days] <= 10
    )
RETURN
    COUNTROWS ( applicationsLessThan10 )

MoreThan10Days =
VAR applicationsMoreThan10 =
    FILTER (
        ADDCOLUMNS (
            SUMMARIZE (
                ApplicationProcessing,
                ApplicationProcessing[Month Name],
                ApplicationProcessing[Application]
            ),
            "Days", CALCULATE ( SUM ( ApplicationProcessing[Duration] ) )
        ),
        [Days] > 10
    )
RETURN
    COUNTROWS ( applicationsMoreThan10 )
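As a rough cross-check of what these two measures should return on the sample data, here is a small Python sketch of the same steps (group by period and application, sum the durations, filter, count). It is only an illustration, not DAX: the function name count_by_period is made up, and treating the "-" duration for task 3697 as blank is an assumption.

```python
from collections import defaultdict

# Sample rows from the question: (application, task, period, duration).
# Assumption: the "-" duration for task 3697 is treated as blank (None).
rows = [
    ("A", 2859, "Aug-22", 2),
    ("A", 2860, "Aug-22", 2),
    ("A", 2861, "Aug-22", 1),
    ("B", 1990, "Aug-22", 1),
    ("B", 1991, "Aug-22", 20),
    ("C", 9940, "Sep-22", 0),
    ("C", 9941, "Sep-22", 27),
    ("D", 1891, "Sep-22", 1),
    ("D", 1892, "Sep-22", 8),
    ("E", 3697, "Sep-22", None),
    ("E", 3698, "Sep-22", 26),
]


def count_by_period(rows, keep):
    # SUMMARIZE + ADDCOLUMNS: one total per (period, application) pair.
    totals = defaultdict(int)
    for application, _task, period, duration in rows:
        totals[(period, application)] += duration or 0
    # FILTER + COUNTROWS, split by period the way a visual would slice it.
    counts = defaultdict(int)
    for (period, _application), days in totals.items():
        if keep(days):
            counts[period] += 1
    return dict(counts)


less_than_10 = count_by_period(rows, lambda days: days <= 10)
more_than_10 = count_by_period(rows, lambda days: days > 10)
print(less_than_10)  # {'Aug-22': 1, 'Sep-22': 1}
print(more_than_10)  # {'Aug-22': 1, 'Sep-22': 2}
```

Applications A (5 days) and D (9 days) pass the 10-day test; B, C and E do not.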

What is the purpose of using VALUES or ALL in the first parameter of an iterator function?

I know that only CALCULATE can modify the filter context. However, the following are two examples using VALUES and ALL.
Example 1:
Revenue =
SUMX(
    Sales,
    Sales[Order Quantity] * Sales[Unit Price]
)

Revenue Avg Order =
AVERAGEX(
    VALUES('Sales Order'[Sales Order]),
    [Revenue]
)
What is the purpose of VALUES in AVERAGEX function? Is this to add an additional filter context?
Example 2:
Product Quantity Rank =
RANKX(
    ALL('Product'[Product]),
    [Quantity]
)
What is the purpose of using ALL in an iterator function?
Suppose we have a table like this:
ID  Sales Order  Order Quantity  UnitID  Unit Price
1   101          10              4       39.99
2   101          15              3       24.99
3   102          5               2       15.99
4   103          5               1       14.99
5   103          10              3       24.99
Since the Sales Order column has duplicates,
Revenue Avg Order = AVERAGEX ( VALUES ( Sales[Sales Order] ), [Revenue] )
gives a different result than
Revenue Avg ID = AVERAGEX ( Sales, [Revenue] )
since the first averages over the three Sales Order values whereas the second averages over the five ID rows.
Using DISTINCT instead of VALUES would work too.
Using ALL instead of VALUES gives the same total but ignores the local filter context from the table visual:
Revenue Avg All = AVERAGEX ( ALL ( Sales[Sales Order] ), [Revenue] )
In this context, ALL is acting as a table function that returns all of the distinct values of the column specified ignoring filter context.
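To make the difference between the two averages concrete, here is a small Python sketch that reproduces both calculations on the sample table above. It only mimics what the iterators do (it is not DAX), and the variable names are made up for the illustration.

```python
from collections import defaultdict

# Sample rows from the table above: (id, sales_order, order_quantity, unit_price).
sales = [
    (1, 101, 10, 39.99),
    (2, 101, 15, 24.99),
    (3, 102, 5, 15.99),
    (4, 103, 5, 14.99),
    (5, 103, 10, 24.99),
]


def revenue(rows):
    # Revenue = SUMX ( Sales, quantity * price )
    return sum(qty * price for _id, _order, qty, price in rows)


# AVERAGEX ( VALUES ( Sales[Sales Order] ), [Revenue] ):
# iterate the distinct orders and evaluate Revenue once per order.
by_order = defaultdict(list)
for row in sales:
    by_order[row[1]].append(row)
avg_per_order = sum(revenue(rows) for rows in by_order.values()) / len(by_order)

# AVERAGEX ( Sales, [Revenue] ): iterate every row (each ID is unique).
avg_per_row = sum(revenue([row]) for row in sales) / len(sales)

print(round(avg_per_order, 2))  # 393.18
print(round(avg_per_row, 2))    # 235.91
```

Both averages divide the same total revenue (1179.55), once by the three distinct orders and once by the five rows.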

How to produce a snapshot table using Power BI measure

My intention is to populate days of the month to simulate a data warehouse periodic snapshot table using DAX measures. My goal is to show non-additive values for the quantity.
Consider the following transactions:
The granularity of my snapshot table is day. So it should show the following:
Take note that a day may have multiple entries, but I am only interested in the latest entry for the day. If I am looking at the figures using a week period, it should show the latest entry for the week. It all depends on the filter context.
However after applying the measure I end up with:
There are three transactions. Two on day 2 and the other on day 4. Instead of calculating a running total I want to show the latest Qty for the days which have no transactions without running accumulating totals. So, day 4 should show 4 instead of summing up day 3 and day 4 which gives me 10. I've been experimenting with LASTNONBLANK without much success.
This is the measure I'm using:
Snapshot =
CALCULATE(
    SUM('Inventory'[Quantity]),
    FILTER(
        ALL ( 'Date'[Date] ),
        'Date'[Date] <= MAX( 'Date'[Date] )
    )
)
There are two tables involved:
Table # 1: Inventory table containing the transactions. It includes the product id, the date/time the transaction was recorded and the quantity.
Table # 2: A date table 'Date' which has been marked as a date table in Power BI. There is a relationship between the Inventory and the Date table based on a date key. So, in the measure, 'Date'[Date] refers to the Date column in the Date table.
You can use the LASTNONBLANKVALUE function, that returns the last value of the expression specified as second parameter sorted by the column specified as first parameter.
Since LASTNONBLANKVALUE implicitly wraps the second parameter in a CALCULATE, a context transition happens and the row context is transformed into the corresponding filter context. So we also need to use VALUES to apply the filter context to the T[Qty] column. The returned table has a single row and a single column, and DAX can automatically convert such a table to a scalar value.
Then, since we don't have a dimension table, we have to get rid of cross-filtering, so we must use REMOVEFILTERS over the whole table.
The filter expression T[Day] <= MaxDay is needed because LASTNONBLANKVALUE must be called in a filter context containing all the rows preceding and including the current one.
So, assuming that the table name is T with fields Day and Qty like in your sample data, this code should work
Edit: changed to use SUM ( T[Qty] ) instead of VALUES ( T[Qty] ) in order to support multiple rows with the same day, assuming the desired result is the sum of the last day's quantities
Measure =
VAR MaxDay =
    MAX ( T[Day] )
RETURN
    CALCULATE (
        LASTNONBLANKVALUE (
            T[Day],
            SUM ( T[Qty] )
        ),
        T[Day] <= MaxDay,
        REMOVEFILTERS ( T )
    ) + 0
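For intuition, the logic of this measure can be mimicked in a few lines of Python. This is only a sketch: the transactions are hypothetical (the question's data is not shown here), the function name snapshot is made up, and it assumes every row has a quantity.

```python
# Hypothetical transactions as (day, qty): two on day 2 and one on day 4.
transactions = [(2, 1), (2, 5), (4, 4)]


def snapshot(transactions, max_day):
    # T[Day] <= MaxDay with REMOVEFILTERS ( T ): consider all rows up to max_day.
    visible = [(day, qty) for day, qty in transactions if day <= max_day]
    if not visible:
        return 0  # the "+ 0" in the measure turns a blank into 0
    # LASTNONBLANKVALUE ( T[Day], SUM ( T[Qty] ) ): sum for the latest day only.
    last_day = max(day for day, _qty in visible)
    return sum(qty for day, qty in visible if day == last_day)


print([snapshot(transactions, day) for day in range(1, 6)])  # [0, 6, 6, 4, 4]
```

Day 3 carries forward the day 2 value, and day 4 shows only its own quantity instead of a running total.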
Edit: after reading the comments, this might work on your model (untested)
Measure =
VAR MaxDay =
    MAX ( 'Date'[Date] )
RETURN
    CALCULATE (
        LASTNONBLANKVALUE (
            Inventory[RecordedDate],
            SUM ( Inventory[Quantity] )
        ),
        'Date'[Date] <= MaxDay
    ) + 0

How to get total of measures for a particular granularity in PowerBI

I have 2 tables for my book inventory. Incoming 'Buy' table and outgoing 'Sales'. There are 2 sources for Incoming, Buy list, and Others, based on a per shipment basis. And Sales has sale records for each book for a given order.
By rule, any sale made in a period should first be debited from the Others column of the 'Buy' table count, and the rest should be treated as a Buylist sale. (E.g., for Book A, we bought 5 books from 'Buylist' and 5 from 'Other' and were able to sell 7 books in the month of May; I will conclude that we sold 5 books from others and 2 books from 'Buylist', irrespective of when they were bought.)
My schema looks like this.
I created a Bookkey table as Power BI was not allowing a many-to-many connection between Buy and Sales. I created 2 measures to calculate the BuyBack sales and the excess inventory left from BuyBack.
BuyList Sales =
IF (
    ( SUM ( Sales[Sales] ) - SUM ( Buy[Others] ) ) > 0,
    IF (
        ( SUM ( Sales[Sales] ) - SUM ( Buy[Others] ) - SUM ( Buy[BuyList] ) ) >= 0,
        SUM ( Buy[BuyList] ),
        SUM ( Sales[Sales] ) - SUM ( Buy[Others] )
    ),
    0
)
and
BuyList Excess = SUM( Buy[BuyList] ) - [BuyList Sales]
I am getting the correct results in the rows at the BookKey granularity, but the totals are not correct, as they are calculated on the entire data set. Is there any other way of getting the total while also being able to control the period?
Can anyone help me get the correct Total amount?
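The rule in the question, and the reason a total row can disagree with the sum of the per-book rows, can be sketched in Python as follows. This is only an illustration of the arithmetic, not DAX: Book A uses the numbers from the question, Book B is hypothetical, and the function names are made up.

```python
def buylist_sales(sales, others, buylist):
    # Sales are debited from Others first; only the remainder counts as
    # BuyList sales, capped at the BuyList quantity that was bought.
    return min(buylist, max(sales - others, 0))


def buylist_excess(sales, others, buylist):
    return buylist - buylist_sales(sales, others, buylist)


# Book A is the example from the question; Book B is hypothetical.
books = {
    "A": {"sales": 7, "others": 5, "buylist": 5},
    "B": {"sales": 1, "others": 4, "buylist": 3},
}

per_book = {key: buylist_sales(**book) for key, book in books.items()}
sum_of_rows = sum(per_book.values())

# What a total row does: aggregate every column first, then apply the rule once.
grand = {
    col: sum(book[col] for book in books.values())
    for col in ("sales", "others", "buylist")
}
total_row = buylist_sales(**grand)

print(per_book)     # {'A': 2, 'B': 0}
print(sum_of_rows)  # 2
print(total_row)    # 0
```

With these numbers the per-book rows add up to 2, while applying the rule to the aggregated columns gives 0, which is the kind of mismatch described above.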

DAX measure: project duration (days) from dimension starting & ending date

I have following scenario which has been simplified a little:
Costs fact table:
date, project_key, costs €
Project dimension:
project_key, name, starting date, ending date
Date dimension:
date, years, months, weeks, etc
I need to create a measure that gives the project duration in days, using the starting and ending dates from the project dimension. The first challenge is that there aren't transactions for all days in the fact table. The project starting date might be the 1st of January, but the first cost transaction in the fact table might be on the 15th of January. So we still need to count the days between the starting and ending dates if they are in the filter context.
The second challenge is the filter context. The user might want to view only February. So if the project starting date is 1.6.2016 and the ending date is 1.11.2016, and the user wants to view only September, it should display only 30 days.
The third challenge is to view days for multiple projects. So if the user selects only a single day, it should show the count of all projects in progress.
I'm thankful for any help which could lead towards the solution. So don't hesitate to ask more details if needed.
edit: Here is a picture to explain this better:
Update 7.2.2017
Still trying to create a single measure for this solution. Measure which user could use with only dates, projects or as it is. Separate calculated column for ongoing project counts per day would be easy solution but it would only filter by date table.
Update 9.2.2017
Thank you all for your efforts. As an end result I'm confident that calculations not based on fact table are quite tricky. For this specific case I ended up doing new table with CROSS JOIN on dates and project ids to fulfill all requirements. One option also was to add starting and ending dates as own lines to fact table with zero costs. The real solution also have more dimensions we need to take into consideration.
To get the expected result you have to create a calculated column and a measure. The calculated column lets you count the number of projects on dates where projects were executed, and the measure counts the number of days elapsed between [starting_date] and [ending_date] in each project, taking filters into account.
The calculated column has to be created in the date_dim table using this expression:
Count of Projects =
SUMX (
    FILTER (
        project_dim,
        [starting_date] <= EARLIER ( date_dim[date] )
            && [ending_date] >= EARLIER ( date_dim[date] )
    ),
    1
)
The measure should be created in the project_dim table using this expression:
Duration (Days) =
DATEDIFF (
    MAX ( MIN ( [starting_date] ), MIN ( date_dim[date] ) ),
    MIN ( MAX ( [ending_date] ), MAX ( date_dim[date] ) ),
    DAY
)
    + 1
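The idea behind this measure is to clip the project's date range to the dates visible in the filter context and count the days that remain. A minimal Python sketch of that arithmetic, using the example from the question (the function name is made up, and the guard for non-overlapping ranges is an addition that the measure itself does not have):

```python
from datetime import date


def duration_days(project_start, project_end, filter_start, filter_end):
    # MAX ( starting date, first visible date ), MIN ( ending date, last visible date )
    start = max(project_start, filter_start)
    end = min(project_end, filter_end)
    # DATEDIFF ( ..., DAY ) + 1 counts both endpoints.
    # The max(..., 0) guard for non-overlapping ranges is NOT part of the measure.
    return max((end - start).days + 1, 0)


# Project runs 1.6.2016 to 1.11.2016; the user views September only.
print(duration_days(date(2016, 6, 1), date(2016, 11, 1),
                    date(2016, 9, 1), date(2016, 9, 30)))  # 30
```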
The result you will get is something like this:
And this if you filter the week using a slicer or a filter on the date_dim table
Update
Support for SSAS 2014 - DATEDIFF() is available in SSAS 2016.
First of all, it is important to realize that you are measuring two different things but want only one measure visible to your users. In the first expected result you want the number of projects running on each date, while in expected results 2 and 3 (in the OP) you want the days elapsed in each project, taking into account filters on date_dim.
You can create a measure that wraps both measures in one and use HASONEFILTER to determine the context where each measure should run. Before continuing with the wrapping measure, check the measure below, which replaces the one posted above that uses the DATEDIFF function, since DATEDIFF doesn't work in your environment.
After creating the previous calculated column, which is required to determine the number of projects on each date, create a measure called Duration Measure. This measure won't be used by your users, but it lets us calculate the final measure.
Duration Measure =
SUMX (
    FILTER (
        date_dim,
        date_dim[date] >= MIN ( project_dim[starting_date] )
            && date_dim[date] <= MAX ( project_dim[ending_date] )
    ),
    1
)
Now the final measure, which your users should interact with, can be written like this:
Duration (Days) =
IF (
    HASONEFILTER ( date_dim[date] ),
    SUM ( date_dim[Count of Projects] ),
    [Duration Measure]
)
This measure will determine the context and will return the right measure for the given context. So you can add the same measure for both tables and it will return the desired result.
Although this solution is demonstrated in Power BI, it works in Power Pivot too.
First I would create 2 relationships:
project_dim[project_key] => costs_fact[project_key]
date_dim[date] => costs_fact[date]
The Costs measure would be just: SUM ( costs_fact[costs] )
The Duration (days) measure needs a CALCULATE to change the filter context on the Date dimension. This is effectively calculating a relationship between project_dim and date_dim on the fly, based on the selected rows from both tables.
Duration (days) =
CALCULATE (
    COUNTROWS ( date_dim ),
    FILTER (
        date_dim,
        date_dim[date] >= MIN ( project_dim[starting_date] )
            && date_dim[date] <= MAX ( project_dim[ending_date] )
    )
)
I suggest you separate the measure Duration (days) into a different calculated column and measure, as they don't actually have the same meaning under different contexts.
First of all, create one-to-many relationships between dates/costs and projects/costs. (Note the single cross filter direction; otherwise the filter context will be wrongly applied during calculation.)
For the Expected result 1, I've created a calculated column in the date dimension called Project (days). It counts how many projects are in progress for a given day.
Project (days) =
COUNTROWS(
    FILTER(
        projects,
        dates[date] >= projects[starting_date] &&
        dates[date] <= projects[ending_date]
    )
)
P.S. If you want to have aggregated results on weekly/monthly basis, you can further create a measure and aggregate Project (days).
For Expected result 2 and 3, the measure Duration (days) is as follows:
Duration (days) =
COUNTROWS(
    FILTER(
        dates,
        dates[date] >= FIRSTDATE(projects[starting_date]) &&
        dates[date] <= FIRSTDATE(projects[ending_date])
    )
)
The result will be as expected: