This is kind of similar to my question here, but different enough, I think, to justify a new question. Looking at the table below, I want to take the total of Direct Expense across all regions, subtract that from the Americas Expense number only, and then total up the result.
I'd like the last column's total to read 17,661, not 54,888. Here is a link to a sample workbook on OneDrive with the above table: https://1drv.ms/u/s!Al7VQqB8RVlWgY4mUYqouesRPNE0qw?e=IiMPnq Any ideas?
Your measure had two issues:
1. It was missing a context transition; this can be fixed by adding CALCULATE where a context transition is needed.
2. [Total Direct Expense total for Region (c)] uses ALLSELECTED, and it's called inside an iterator. We can instead evaluate the measure before the iteration and store it in a variable. (The number 13,703 is wrong.)
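For reference, a sketch of the problematic shape described above (not necessarily the exact original measure; column and measure names are taken from the fixed version below):

Final Result (broken sketch) =
SUMX (
    VALUES ( Sheet1[Region] ),
    IF (
        Sheet1[Region] = "Americas",
        // no CALCULATE: SUM is evaluated in the outer filter context, not per iterated region
        SUM ( Sheet1[Total Expense (a)] )
            // the ALLSELECTED-based measure is evaluated after the context transition
            // on Sheet1[Region], so it no longer sees the originally selected regions
            - [Total Direct Expense total for Region (c)],
        SUM ( Sheet1[Total Expense (a)] )
    )
)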
The measure then becomes
Final Result (a-c) =
VAR TotalExpense = [Total Direct Expense total for Region (c)]
RETURN
    SUMX (
        VALUES ( Sheet1[Region] ),
        IF (
            Sheet1[Region] = "Americas",
            CALCULATE (
                SUM ( Sheet1[Total Expense (a)] ) - TotalExpense
            ),
            CALCULATE (
                SUM ( Sheet1[Total Expense (a)] )
            )
        )
    )
Now the final result is
A rather complex article about ALLSELECTED, explaining what the shadow filter is and why calling ALLSELECTED after a context transition in an iteration doesn't work as expected, can be found here: https://www.sqlbi.com/articles/the-definitive-guide-to-allselected/
Following are 2 measures:
SUMX ( ALL ( SALES ) , SALES[AMT] )
CALCULATE ( SUMX ( SALES, SALES[AMT] ), ALL (SALES) )
Similarly for the following 2 measures:
SUMX ( FILTER ( SALES, SALES[QTY]>1 ), SALES[QTY] * SALES[AMT] )
CALCULATE ( SUMX ( SALES, SALES[QTY] * SALES[AMT] ), FILTER ( SALES, SALES[QTY]>1 ) )
Both of the above examples clear the natural filters on the SALES table and perform the aggregation.
I'm trying to understand what is the significance/use case of using either approach maybe in terms of logic or performance?
In DAX you can achieve the same result with different queries/syntax.
So, based on my understanding, both of these DAX expressions provide the same result:
SUMX ( ALL ( SALES ) , SALES[AMT] )
CALCULATE ( SUMX ( SALES, SALES[AMT] ), ALL (SALES) )
And the 1st one is a more concise way to achieve it than the 2nd one in all cases/scenarios.
When I tested these out with <100 records in a table, the performance was the same for both measures.
But ideally the 1st scenario would be quicker than the 2nd one, which we could test with >1 million records through DAX Studio.
Can you please share your thoughts on the same?
The first uses a table function to return the whole Sales table and then iterates over it. The second iterates over the Sales table in the context of CALCULATE, which removes any filters that were present on the Sales table.
SUMX ( ALL ( SALES ) , SALES[AMT] )
CALCULATE ( SUMX ( SALES, SALES[AMT] ), ALL (SALES) )
In these two DAX expressions, ALL() is doing two very different things, and it is unfortunate that the same name was used. In the first one, ALL() is being used as a table function and returns a table. In the second one, ALL() is being used to remove filters and could be replaced with REMOVEFILTERS() (the first one cannot be replaced this same way).
This is a lengthy and detailed topic and I suggest you make a cup of coffee and have a read here: https://www.sqlbi.com/articles/managing-all-functions-in-dax-all-allselected-allnoblankrow-allexcept/
To summarise the article, ALL() and REMOVEFILTERS() are not the same. ALL() can be used where REMOVEFILTERS() is used but not vice versa.
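For instance, a minimal sketch (hypothetical measure names, with the Sales table and AMT column from the question):

// ALL() as a CALCULATE modifier can be swapped for REMOVEFILTERS()
Amt All Rows (modifier) = CALCULATE ( SUMX ( Sales, Sales[AMT] ), REMOVEFILTERS ( Sales ) )

// ALL() as a table function cannot: REMOVEFILTERS() does not return a table to iterate
Amt All Rows (table) = SUMX ( ALL ( Sales ), Sales[AMT] )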
CALCULATE ( SUMX ( SALES, SALES[QTY] * SALES[AMT] ), FILTER ( SALES, SALES[QTY]>1 ) )
This DAX uses calculate to change the filter context and remove any existing filters. The important thing is that it is removing existing filters.
They mainly achieve the same result (most of the time), but there is still more nuance. In DAX, there are always multiple ways of achieving the same outcome. More importantly, DAX is always dependent on the evaluation context. Writing SUM(SALES[AMT]) can return different numbers depending on context: if it were in a table with colour, it would return the sum per colour on each line plus a total; if it were by country, it would return a total per country plus a grand total. That is, the exact same formula returns different results depending on context. In this simplistic example, though, they are essentially the same.
The second example would also never be written this way, as you should never filter entire tables (especially fact tables). You would filter the column instead, e.g.
SUMX (
    FILTER ( VALUES ( Sales[Quantity] ), Sales[Quantity] > 1 ),
    Sales[Quantity] * CALCULATE ( SUM ( Sales[SalesAmount] ) )
)
This whole video is an excellent watch but if you watch from 45:33, you can see a good explanation of the difference between removing filters and returning a table which is the essence of your question. You also need to understand expanded tables which is explained earlier in the video. youtube.com/watch?v=teYwjHkCEm0&list=WL&index=2
At the risk of stating the obvious, you are wrapping a function (SUMX) inside a process represented by CALCULATE function.
It is an actual process, which will attempt a context transition.
Beyond the performance implications of forcing extra processing, the answer to your question heavily depends on how and where these measures get injected into the model, as it determines if the context transition would occur.
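As a minimal sketch of that dependence (hypothetical calculated columns on a Product dimension related to Sales, not from the question):

// Calculated column on Product: there is a row context but no CALCULATE, so the
// current row is not turned into a filter and every product shows the grand total.
Amt NoTransition = SUMX ( Sales, Sales[AMT] )

// Wrapping the same expression in CALCULATE triggers a context transition: the
// current Product row becomes a filter, so each row shows only its own sales.
Amt WithTransition = CALCULATE ( SUMX ( Sales, Sales[AMT] ) )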
For reference, here are just some of the relevant SQLBI articles: https://www.sqlbi.com/articles/introducing-calculate-in-dax/
https://www.sqlbi.com/articles/understanding-context-transition-in-dax/
I was doing some DAX exercises and came across this measure:
Average Return on Investment =
DIVIDE (
    SUMX (
        fact_RoI;
        fact_RoI[Return on Investment] * [SumInvestment]
    );
    SUMX ( fact_RoI; [SumInvestment] )
)
This measure works, but can someone explain to me why they are using a SUMX in the bottom part as well? I tried just using the [SumInvestment] measure instead and got a very wrong result, but I do not understand why.
edit: adding this as requested (sorry, didn't think it was relevant)
SumInvestment =
CALCULATE (
    SUM ( fact_Investments[Investment] );
    CROSSFILTER (
        fact_RoI[Product];
        dim_Product[Product];
        Both
    )
)
SUMX ( fact_RoI; [SumInvestment] ) iterates over the fact_RoI table, but the SumInvestment measure sums a column from another table (fact_Investments). Thus, if you want a calculation based on the RoI rows, you need to iterate over fact_RoI as in the example.
Otherwise the SUM of fact_Investments probably just produces a meaningless result, like: sum all investments regardless of their connection to RoI.
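A minimal sketch of the contrast (measure names are illustrative; the semicolon list separator is kept as in the question):

// Evaluated once, in the current filter context: it sums fact_Investments without
// weighting by the fact_RoI rows, so the denominator does not match the numerator.
Denominator without iteration = [SumInvestment]

// Evaluated once per fact_RoI row (context transition on every row), so each
// investment is picked up in the context of its RoI row and the weights match.
Denominator per RoI row = SUMX ( fact_RoI; [SumInvestment] )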
I have the following graph created. It tracks the count of a certain event in a quarter by group (I erased the group names and renamed them to ABCs due to sensitive data).
I need the graph to show the cumulative value, that is to say, for example: Q1 A=1, Q2 A=3, Q3 A=5.
I have played around with quick measures, but I can't seem to make them break down the accumulation by group, only by quarter (Q1 = 1, Q2 = 6, etc.).
I think I need to create a quick measure of a quick measure, but I am not sure of the order and what the measures would look like.
There are only 2 relevant fields: date_of_event and group
X axis: date of event (by year and quarter), group
Y axis: count of date_of_event
Thanks
For this, you'll definitely benefit from a date dimension and a dimension for your group. There are many template date dimensions out there, but I'm partial to mine. A group dimension for you may be as simple as just taking the distinct values of your existing [Group] field.
Time intelligence is basically always easier when your model is dimensionalized.
With that, you'd set up relationships as below:
'DimDate'[Date] -1:N-> 'YourEventTable'[Date_Of_Event]
'DimGroup'[Group] -1:N-> 'YourEventTable'[Group]
With that in place, you can use the built-in time intelligence functions or roll your own (examples of rolling your own in my linked date dimension repo).
Events = COUNTROWS ( 'YourEventTable' )
Events YTD = TOTALYTD ( [Events], 'DimDate'[Date] )
If you need an all-time cumulative, instead, you can use this:
Events All-time Cumulative =
VAR CurrentDate = MAX ( 'DimDate'[Date] )
RETURN
    CALCULATE (
        [Events],
        ALL ( 'DimDate' ),
        'DimDate'[Date] <= CurrentDate
    )
Make sure to always use dimension fields for axis labels and the like, and never the ones from the fact table.
I encountered this earlier this week, and below is my DAX for the cumulative total measure:
Cumulative Total =
CALCULATE (
    SUMX (
        Table,
        IF ( DISTINCTCOUNT ( Table[UserID] ) > 0, 1, 0 ) // Put your group here
    ),
    FILTER (
        ALLSELECTED ( Table ),
        Table[InitialAccessDate]                  // Date of event
            <= MAX ( Table[InitialAccessDate] )   // Date of event
    )
)
I hope it helps!! Cheers!!
I have a rather complicated data set, but will attempt to simplify it for this post.
I have contract data in table F1_Contract with a simplified format of:
I am attempting to calculate the expected profit from contracts.
The first thing that I needed is to calculate the incremental Volume from each contract that was valid between the Current Date and the Next Date, depending upon the date slicer used in the view. After much pain and anguish, someone pointed me to a post on StackOverflow that resolved my issue:
Create a Disconnected Date table, use that as the Date Slicer, then calculate the date difference between the Current Date, START_DATE of the slicer, Next Date, and END_DATE of the slicer.
The resulting Measure is
DELTA DATE =
CALCULATE (
    SUMX (
        F1_Contract,
        DATEDIFF (
            MAX ( MAX ( F1_Contract[CURRENT_CONTRACT_DATE] ), [Disconnected_MIN_Date] ),
            MIN ( MAX ( F1_Contract[NEXT_CONTRACT_DATE] ), [Disconnected_MAX_Date] ),
            DAY
        )
    ),
    FILTER (
        F1_Contract,
        F1_Contract[CURRENT_CONTRACT_DATE] <= [Disconnected_MAX_Date]
            && F1_Contract[NEXT_CONTRACT_DATE] >= [Disconnected_MIN_Date]
    )
)
I then take that Measure and multiply it by the VOLUME_PER_DAY to get the incremental Volume for the view with the following formula:
Incremental Cumulative VOLUME =
CALCULATE ( SUMX ( F1_Contract, F1_Contract[VOLUME_PER_DAY] * [DELTA DATE] ) )
To calculate F1 Revenue and F1 Cost, I take the F1 Unit Cost and the appropriate F1 price based on the Incremental Volume and derive the following measures:
Incremental F1 Revenue =
CALCULATE (
    MAX (
        SUMX (
            F1_Contract,
            [Incremental Cumulative VOLUME] * [F1 Sell Rate # GAD Per Shipment]
        ),
        [Calc F1 MinCharge]
    )
)
Incremental F1 Cost =
CALCULATE (
    SUMX ( F1_Contract, [Incremental Cumulative VOLUME] * F1_Contract[F1_Cost] )
)
This all works great! I can create a report at the ID level, Indicator level, or the Lane level and all of the numbers are correct.
The problem is that I have a second revenue table, F2_Contract_Revenue, that consists of F2 revenues formatted like the following (note there may be 0 to 15 rows in F2_Contract_Revenue for any given ID in F1_Contract)
F2_Contract_Revenue:
Although the ID in F1_Contract is unique, just to be on the safe side I have a separate DISTINCT_ID table that I have used to link the ID from F1_Contract and F2_Contract_Revenue.
Now I need to calculate the F2 revenue for each ID, using a visual formula of:
If(BASIS = "FLAT", F2_Unit_Rev, MAX(F2_Min, Incremental Volume * F2_Unit_Rev))
The Measure I created after about 30 attempts is:
F2 Revenue =
CALCULATE (
    SUMX (
        F2_Contract_Revenue,
        MAX (
            [Incremental Cumulative VOLUME]
                * IF ( F2_Contract_Revenue[BASIS] = "RATE", F2_Contract_Revenue[F2_Unit_Rev], 0 ),
            F2_Contract_Revenue[F2_Min]
        )
            + IF ( F2_Contract_Revenue[BASIS] = "FLAT", F2_Contract_Revenue[F2_Unit_Rev], 0 )
    ),
    FILTER (
        F2_Contract_Revenue,
        F2_Contract_Revenue[ID] = RELATED ( F1_Contract[ID] )
    )
)
This works correctly at the Lane level. However, in the views at the ID level it is slightly off (I have not been able to track down why), and at the Indicator level it is exponentially off.
I need to use this in a formula that will be represented as
F1 Revenue + F2 Revenue - F1 Cost, which is of course also exponentially off at the INDICATOR level (note there are multiple rows with INDICATOR = 1 and a single row with INDICATOR = 2).
The data is proprietary, so I cannot share the PowerBI file, however, I can answer more specific questions with the data that I’ve cleaned up here.
Any advice, thoughts, corrections, help is greatly anticipated and appreciated!!!
[Suggest you show the model with the relationship direction]
At a quick view of this, I think the problem is in the data model: the relationship direction.
[Incremental F1 Revenue] works fine (at the ID level, Indicator level, or Lane level)
because these all live in the F1 table.
But when it comes to [F2 Revenue], you get the problem
(that measure involves both the F1 table and the F2 table).
FILTER (
    F2_Contract_Revenue,
    F2_Contract_Revenue[ID] = RELATED ( F1_Contract[ID] )
)
Also, you said that "just to be on the safe side I have a separate DISTINCT_ID table",
so I'd like to ask: could you show your model (dimTable - factTable) for troubleshooting?
In DAX, it's all about relationships and the model.
(Once you know DAX well enough, the model might be the real problem.)
After reading a lot more posts and going through some helpful videos, I determined the problem was in the ordering of the values in my aggregation statement, as well as insufficient bounding of the time dimension.
I have three steps:
a) for each row of the F2 table, determine whether the value is FLAT or RATE; if FLAT, use that value; if RATE, multiply that value by the dynamic measure that determines volume based on the report's date slicer
b) compare the FLAT or RATE outcome to the MIN value in the table
c) aggregate those values using the link between the F1 and F2 tables.
The calculation that works is:
F2 =
CALCULATE (
    SUMX (
        F2_Contract_Revenue,
        MAX (
            IF (
                F2_Contract_Revenue[BASIS] = "RATE",
                F2_Contract_Revenue[F2_Unit_Rev] * [Incremental Cumulative Volume],
                F2_Contract_Revenue[F2_Unit_Rev]
            ),
            F2_Contract_Revenue[F2_MIN]
        )
    ),
    FILTER (
        F2_Contract_Revenue,
        F2_Contract_Revenue[ID] = RELATED ( F1_Contract[ID] )
    ),
    FILTER (
        F1_Contract,
        F1_Contract[CURRENT_CONTRACT_DATE] <= [Disconnected_MAX_Date]
            && F1_Contract[NEXT_CONTRACT_DATE] >= [Disconnected_MIN_Date]
    )
)
I appreciate the response, and for the next issue I will create a generic model that I can post as an example.
I have the following scenario, which has been simplified a little:
Costs fact table:
date, project_key, costs €
Project dimension:
project_key, name, starting date, ending date
Date dimension:
date, years, months, weeks, etc
I would need to create a measure which would tell the project duration in days using the starting and ending dates from the project dimension. The first challenge is that there aren't transactions for all days in the fact table. The project starting date might be the 1st of January, but the first cost transaction in the fact table might be on the 15th of January. So we still need to count the days between the starting and ending date that fall within the filter context.
The second challenge is the filter context itself. The user might want to view only a single month, e.g. February. So if the project starting date is 1.6.2016 and the ending date is 1.11.2016 and the user wants to view only September, it should display only 30 days.
The third challenge is viewing days for multiple projects. So if the user selects only a single day, it should show the count for all of the projects in progress.
I'm thankful for any help which could lead towards the solution. So don't hesitate to ask more details if needed.
edit: Here is a picture to explain this better:
Update 7.2.2017
Still trying to create a single measure for this solution: a measure which the user could use with only dates, only projects, or as it is. A separate calculated column for ongoing project counts per day would be an easy solution, but it would only filter by the date table.
Update 9.2.2017
Thank you all for your efforts. As an end result, I'm confident that calculations not based on the fact table are quite tricky. For this specific case I ended up creating a new table with a CROSS JOIN on dates and project ids to fulfill all the requirements. One option was also to add the starting and ending dates as their own rows in the fact table with zero costs. The real solution also has more dimensions we need to take into consideration.
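A minimal sketch of such a helper table as a DAX calculated table (the name is hypothetical; the shape follows the description above):

// One row per project per day between its starting and ending dates.
ProjectDays =
GENERATE (
    project_dim,
    FILTER (
        ALL ( date_dim[date] ),
        date_dim[date] >= project_dim[starting_date]
            && date_dim[date] <= project_dim[ending_date]
    )
)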
To get the expected result you have to create a calculated column and a measure: the calculated column counts the number of projects active on each date, and the measure counts the number of days elapsed between [starting_date] and [ending_date] for each project, taking filters into account.
The calculated column has to be created in the date_dim table using this expression:
Count of Projects =
SUMX (
    FILTER (
        project_dim,
        [starting_date] <= EARLIER ( date_dim[date] )
            && [ending_date] >= EARLIER ( date_dim[date] )
    ),
    1
)
The measure should be created in the project_dim table using this expression:
Duration (Days) =
DATEDIFF (
    MAX ( MIN ( [starting_date] ), MIN ( date_dim[date] ) ),
    MIN ( MAX ( [ending_date] ), MAX ( date_dim[date] ) ),
    DAY
) + 1
The result you will get is something like this:
And this is what you get if you filter the week using a slicer or a filter on the date_dim table:
Update
Support for SSAS 2014 (DATEDIFF() is only available starting with SSAS 2016).
First of all, it is important to realize that you are measuring two different things but want only one measure visible to your users. In the first expected result you want the number of projects running on each date, while in expected results 2 and 3 (in the OP) you want the days elapsed in each project, taking into account filters on date_dim.
You can create a measure that wraps both measures in one and use HASONEFILTER to determine the context in which each measure should run. Before continuing with the wrapping measure, check the measure below, which replaces the measure posted above that uses the DATEDIFF function (which doesn't work in your environment).
After creating the previous calculated column, which is required to determine the number of projects on each date, create a measure called Duration Measure. This measure won't be used by your users, but it lets us calculate the final measure.
Duration Measure =
SUMX (
    FILTER (
        date_dim,
        date_dim[date] >= MIN ( project_dim[starting_date] )
            && date_dim[date] <= MAX ( project_dim[ending_date] )
    ),
    1
)
Now the final measure your users should interact with can be written like this:
Duration (Days) =
IF (
    HASONEFILTER ( date_dim[date] ),
    SUM ( date_dim[Count of Projects] ),
    [Duration Measure]
)
This measure will determine the context and return the right result for the given context, so you can add the same measure to both tables and it will return the desired result.
Although this solution is demonstrated in Power BI, it works in Power Pivot too.
First I would create 2 relationships:
project_dim[project_key] => costs_fact[project_key]
date_dim[date] => costs_fact[date]
The Costs measure would be just: SUM ( costs_fact[costs] )
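Spelled out as a measure definition (the name is just illustrative):

Costs = SUM ( costs_fact[costs] )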
The Duration (days) measure needs a CALCULATE to change the filter context on the date dimension. This effectively calculates a relationship between project_dim and date_dim on the fly, based on the selected rows from both tables.
Duration (days) =
CALCULATE (
    COUNTROWS ( date_dim ),
    FILTER (
        date_dim,
        date_dim[date] >= MIN ( project_dim[starting_date] )
            && date_dim[date] <= MAX ( project_dim[ending_date] )
    )
)
I suggest you separate the Duration (days) measure into a distinct calculated column and measure, as they don't actually have the same meaning under different contexts.
First of all, create one-to-many relationships between dates/costs and projects/costs. (Note the single cross-filter direction, or the filter context will be wrongly applied during calculation.)
For Expected result 1, I've created a calculated column in the date dimension called Project (days). It counts how many projects are in progress on a given day.
Project (days) =
COUNTROWS (
    FILTER (
        projects,
        dates[date] >= projects[starting_date]
            && dates[date] <= projects[ending_date]
    )
)
P.S. If you want aggregated results on a weekly/monthly basis, you can further create a measure that aggregates Project (days), for example as sketched below.
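A minimal sketch of that aggregation (hypothetical measure name, assuming the Project (days) calculated column above):

// Summing the per-day column over the dates visible in the current week/month
// gives the aggregated project-days for that period.
Project Days Aggregated = SUM ( dates[Project (days)] )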
For Expected result 2 and 3, the measure Duration (days) is as follows:
Duration (days) =
COUNTROWS (
    FILTER (
        dates,
        dates[date] >= FIRSTDATE ( projects[starting_date] )
            && dates[date] <= FIRSTDATE ( projects[ending_date] )
    )
)
The result will be as expected: