What is the difference between SUMX(ALL...) vs CALCULATE(SUMX.., ALL..)? - powerbi

Following are 2 measures:
SUMX ( ALL ( SALES ) , SALES[AMT] )
CALCULATE ( SUMX ( SALES, SALES[AMT] ), ALL (SALES) )
Similarly for the following 2 measures:
SUMX ( FILTER ( SALES, SALES[QTY]>1 ), SALES[QTY] * SALES[AMT] )
CALCULATE ( SUMX ( SALES, SALES[QTY] * SALES[AMT] ), FILTER ( SALES, SALES[QTY]>1 ) )
Both above examples clear the natural filters on the SALES table and perform the aggregation.
I'm trying to understand what is the significance/use case of using either approach maybe in terms of logic or performance?

In DAX you can achieve the same results from different DAX queries/syntax.
Based on my understanding, both expressions return the same result:
SUMX ( ALL ( SALES ) , SALES[AMT] )
CALCULATE ( SUMX ( SALES, SALES[AMT] ), ALL (SALES) )
The first one is the more concise way to achieve this than the second one in all cases/scenarios.
When I tested these with fewer than 100 records in a table, the performance was the same for both measures.
Ideally the first would be quicker than the second, which we can test with more than 1 million records through DAX Studio.
Can you please share your thoughts on the same?

The first uses a table function to return the whole SALES table and then iterates over it. The second iterates over the SALES table inside CALCULATE, which removes any filters that were present on the SALES table.
SUMX ( ALL ( SALES ) , SALES[AMT] )
CALCULATE ( SUMX ( SALES, SALES[AMT] ), ALL (SALES) )
In these two DAX expressions, ALL() is doing two very different things, and it is unfortunate that the same name was used. In the first one, ALL() is used as a table function and returns a table. In the second one, ALL() is used to remove filters and could be replaced with REMOVEFILTERS() (the first one cannot be replaced in the same way).
This is a lengthy and detailed topic and I suggest you make a cup of coffee and have a read here: https://www.sqlbi.com/articles/managing-all-functions-in-dax-all-allselected-allnoblankrow-allexcept/
To summarise the article, ALL() and REMOVEFILTERS() are not the same. ALL() can be used where REMOVEFILTERS() is used but not vice versa.
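As a minimal sketch of the two roles (the measure names are made up for illustration), the modifier form can be rewritten with REMOVEFILTERS, while the table-function form cannot:

```dax
-- ALL as a table function: it returns every row of SALES,
-- and SUMX iterates over that table.
Total Amt (table function) =
SUMX ( ALL ( SALES ), SALES[AMT] )

-- ALL as a CALCULATE modifier: no table is returned; filters on SALES
-- are removed before SUMX runs. REMOVEFILTERS is interchangeable
-- with ALL in this position only.
Total Amt (modifier) =
CALCULATE ( SUMX ( SALES, SALES[AMT] ), REMOVEFILTERS ( SALES ) )
```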
CALCULATE ( SUMX ( SALES, SALES[QTY] * SALES[AMT] ), FILTER ( SALES, SALES[QTY]>1 ) )
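This expression uses CALCULATE to change the filter context. The important point is that FILTER ( SALES, ... ) is evaluated in the existing filter context before CALCULATE applies it, so it narrows the SALES rows that are already visible rather than removing existing filters.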
They achieve the same result most of the time, but there is more nuance. In DAX, there are always multiple ways of achieving the same outcome. More importantly, DAX always depends on the evaluation context. Writing SUM(SALES[AMT]) can return different numbers depending on context. In a table visual by colour, it would return the sum per colour on each line plus a total. By country, it would return a sum per country plus a total, i.e. the exact same formula returns different results depending on context. In this simplistic example, though, they are essentially the same.
The second example would also never be written this way, as you should never filter entire tables (especially fact tables). You would filter the column instead, e.g.
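SUMX (
    -- Iterate only the distinct quantities greater than 1
    FILTER ( VALUES ( Sales[Quantity] ), Sales[Quantity] > 1 ),
    -- The row context only holds Sales[Quantity], so the amount must be
    -- aggregated through a context transition rather than referenced directly
    Sales[Quantity] * CALCULATE ( SUM ( Sales[SalesAmount] ) )
)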
This whole video is an excellent watch but if you watch from 45:33, you can see a good explanation of the difference between removing filters and returning a table which is the essence of your question. You also need to understand expanded tables which is explained earlier in the video. youtube.com/watch?v=teYwjHkCEm0&list=WL&index=2

At the risk of stating the obvious, you are wrapping a function (SUMX) inside a process represented by CALCULATE function.
It is an actual process, which will attempt a context transition.
Beyond the performance implications of forcing extra processing, the answer to your question heavily depends on how and where these measures get injected into the model, as it determines if the context transition would occur.
For reference, here are just some of the relevant SQLBI articles: https://www.sqlbi.com/articles/introducing-calculate-in-dax/
https://www.sqlbi.com/articles/understanding-context-transition-in-dax/
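To make the context transition concrete, here is a hedged sketch using two calculated columns. It assumes a hypothetical CUSTOMER table with a one-to-many relationship to SALES; neither the table nor the column names come from the question.

```dax
-- Calculated column on the hypothetical CUSTOMER table.
-- No CALCULATE: the row context does not filter SALES,
-- so every row shows the grand total.
Amt No Transition =
SUMX ( SALES, SALES[AMT] )

-- With CALCULATE: the current CUSTOMER row becomes a filter
-- (context transition), so each row shows only that customer's sales.
Amt With Transition =
CALCULATE ( SUMX ( SALES, SALES[AMT] ) )
```

Where no row context is active, for example a measure placed directly in a visual, there is nothing to transition and the extra CALCULATE only changes the result through its filter arguments.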

Related

Subtotal not reflecting sum of underlying row calculations

This is kind of similar to my question here, but different enough, I think, to justify a new question. Looking at the table below, I want to take the total of Direct Expense across all regions and subtract that from the Americas Expense number only, then total up the result.
I'd like the last column's total to read 17,661, not 54,888. Here is a link to a sample workbook on OneDrive with the above table: https://1drv.ms/u/s!Al7VQqB8RVlWgY4mUYqouesRPNE0qw?e=IiMPnq Any ideas?
Your measure had two issues:
It was missing the context transition. This can be fixed by adding CALCULATE where a context transition is needed.
The [Total Direct Expense total for Region (c)] measure uses ALLSELECTED, and it is called inside an iterator. We can move the measure before the iteration by using a variable instead. (The number 13,703 is wrong.)
The measure then becomes
Final Result (a-c) =
VAR TotalExpense = [Total Direct Expense total for Region (c)]
RETURN
    SUMX (
        VALUES ( Sheet1[Region] ),
        IF (
            Sheet1[Region] = "Americas",
            CALCULATE (
                SUM ( Sheet1[Total Expense (a)] ) - TotalExpense
            ),
            CALCULATE (
                SUM ( Sheet1[Total Expense (a)] )
            )
        )
    )
Now the final result is
A rather complex article about ALLSELECTED, explaining what the shadow filter is and why calling ALLSELECTED after a context transition in an iteration doesn't work as expected, can be found here: https://www.sqlbi.com/articles/the-definitive-guide-to-allselected/

DAX SUMMARIZE miss applied slicers

I have a slicer, called COUNTRY and applied to table MY_TABLE. When I calculate a measure, everything works as expected:
-- calculates distinct count only for COUNTRY = x
Some Measure = DISTINCTCOUNT('MY_TABLE'[SOME_COLUMN])
The problem is SUMMARIZE ignores slicer selection:
-- calculates distinct count across all countries: x, y, z, etc.
Calculated Table =
SUMMARIZE (
    'SOME_TABLE',
    'SOME_TABLE'[CATEGORY],
    "COUNT", DISTINCTCOUNT ( 'SOME_TABLE'[SOME_COLUMN] )
)
How to make SUMMARIZE take into account slicers?
Only measures are "responsive"; calculated tables and columns are calculated and created once, when the data is loaded.
Note that if a table expression is used inside a measure it will behave correctly, but as you may know, a measure must return a scalar value and not a table (i.e. you can use SUMMARIZE inside a measure, then filter the resulting table and return the sum of one column).
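A rough sketch of that pattern, reusing the table and column names from the question. The measure name, the "@Count" column and the threshold of 1 are made up for illustration:

```dax
Count In Repeated Categories =
VAR PerCategory =
    -- Group by category, then compute the distinct count per group.
    -- Because this runs inside a measure, it respects the slicer selection.
    ADDCOLUMNS (
        SUMMARIZE ( 'SOME_TABLE', 'SOME_TABLE'[CATEGORY] ),
        "@Count", CALCULATE ( DISTINCTCOUNT ( 'SOME_TABLE'[SOME_COLUMN] ) )
    )
RETURN
    -- Filter the intermediate table and collapse it to a single scalar
    SUMX ( FILTER ( PerCategory, [@Count] > 1 ), [@Count] )
```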
Of course, you can filter a calculated table with a slicer. If you can, go for SUMMARIZECOLUMNS because this function is better optimized than SUMMARIZE and has arguments for filtering.
Filtering SUMMARIZECOLUMNS
If you want to stick to SUMMARIZE, you can filter your table by wrapping it with CALCULATETABLE.
Calculated Table =
CALCULATETABLE (
    SUMMARIZE (
        'SOME_TABLE',
        [CATEGORY],
        "COUNT", DISTINCTCOUNT ( 'SOME_TABLE'[SOME_COLUMN] )
    ),
    Dim[Color] = SELECTEDVALUE ( Slicer[SlicerValues] )
)
Should FILTER be used inside or outside of SUMMARIZE?

DAX measure to return alternative result for total

How to create DAX measure returning different value for total in table visual? I would like it for conditional formatting for whatever dimension split in table visuals. But since conditional formatting does not work for totals I do not want to display it for that line.
I need something like:
IF(condition_identifying_total_line, "Alternative result", [TrafficLightIcon])
Edit. This does exactly what I want but I hope for more elegant approach or any other suggestions:
IsTotal =
COUNTROWS ( FactTable )
    = CALCULATE (
        COUNTROWS ( FactTable ),
        ALLSELECTED ( FactTable )
    )
This measure works for whatever dimension split of Sales figures in table visual.
There are a variety of options depending on exactly what you want to do. I suggest taking a look at the following functions for ideas:
ISFILTERED
ISCROSSFILTERED
HASONEFILTER
HASONEVALUE
FILTERS
SELECTEDVALUE
For example, if Sales broken out by a column A, here are a couple possible approaches:
Sales = IF( HASONEVALUE( T[A] ), SUM ( T[Sales] ), <Alternative Result> )
Sales = IF( ISFILTERED ( T[A] ), SUM ( T[Sales] ), <Alternative Result> )
You can find full documentation for how to handle granularities from the SQLBI website here: https://www.daxpatterns.com/handling-different-granularities/
Hope this helps!
William
I have ended up using the ISINSCOPE function:
IsTotal =
NOT (
    ISINSCOPE ( products[dimension1] )
        || ISINSCOPE ( products[dimension2] )
        || ISINSCOPE ( stores[dimension1] )
        || ISINSCOPE ( stores[dimension2] )
)
Unfortunately it requires hard coding all dimensions by which we want to slice or group visuals.
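Putting it together with the pseudocode from the question, the final measure might look like the sketch below. [IsTotal] and [TrafficLightIcon] are the measures mentioned above; the measure name and the choice of BLANK () as the alternative result are assumptions.

```dax
TrafficLight Without Total =
IF (
    [IsTotal],
    BLANK (),            -- show nothing on the total line
    [TrafficLightIcon]   -- regular conditional-formatting value on detail rows
)
```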

If a exists pick a else pick b power bi

I have forecast and budget values for the year, and a new forecast is created every quarter. I need PowerBI to pick up the Metric Value (can be Budget, Q1F, Q2F and Q3F) for a given date based on data availability.
Example - If for a given date, data for Q3F is available, pick Q3F, else pick Q2F else Q1F else budget.
This is what my data looks like:
Date Metric Value
1/1/11 Budget 1.1
1/1/11 Q3F 1.2
1/1/11 Q2F 1.3
In this case the function should pick up Q3F since it's available.
One way to solve this is to combine SUMX with a SWITCH statement.
To start, assign a constant to each forecast (e.g. Budget = 1, Q1F = 2, and so on) as a column in your data. The idea is that a more recent forecast has a higher number, which is used in the SWITCH statement later. I refer to it as forecast_ID in this example.
I am also assuming you have a calendar table, and that each forecast is entered for the entire business at once rather than in waves (e.g. not Category A still on budget while Category B has an updated forecast).
The idea below is that SUMX iterates through each quarter you are looking at, e.g. 2018 would run Q1, Q2, Q3, Q4 separately.
Within the context of each quarter, it gets the MAX of your forecast IDs, which is then used in the SWITCH to select the most recent forecast.
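Measure :=
SUMX (
    VALUES ( Calendar[Quarter] ),
    SWITCH (
        -- CALCULATE triggers a context transition so MAX is evaluated per quarter
        CALCULATE ( MAX ( table1[forecast_ID] ) ),
        1, CALCULATE ( SUM ( table1[value] ), table1[Metric] = "Budget" ),
        2, CALCULATE ( SUM ( table1[value] ), table1[Metric] = "Q1F" ),
        -- following the same numbering ("and so on"): Q2F = 3, Q3F = 4
        3, CALCULATE ( SUM ( table1[value] ), table1[Metric] = "Q2F" ),
        4, CALCULATE ( SUM ( table1[value] ), table1[Metric] = "Q3F" )
    )
)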
You could also then do something like MAX ( table1[forecast_ID] ) - 1 to find the previous forecast dynamically.
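As a variation on the same idea, and only a sketch under the same assumptions (a forecast_ID column on table1 and a related Calendar table), the SWITCH can be avoided by filtering on the highest forecast_ID directly. The measure and variable names are made up.

```dax
Latest Forecast Value =
SUMX (
    VALUES ( Calendar[Quarter] ),
    -- Highest forecast_ID available in the current quarter
    VAR LatestId = CALCULATE ( MAX ( table1[forecast_ID] ) )
    RETURN
        -- Sum only the rows belonging to that forecast; use LatestId - 1
        -- here instead to pick up the previous forecast
        CALCULATE ( SUM ( table1[value] ), table1[forecast_ID] = LatestId )
)
```

This version does not need a new SWITCH branch each time another forecast is added.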
If you always want to pull the most recent value then you can use LASTNONBLANK
As Marcus mentioned, the first step is to create a constant for your forecasts. I set up a separate table for this example.
Then you can create a relationship between the two tables based on Metric. Add a calculated column to your original table
MetricConstant = RELATED(Table2[Constant])
Now create a measure to pull the most recent value within each date period
Measure =
SUMX (
    VALUES ( Table1[Date] ),
    CALCULATE (
        SUM ( Table1[Value] ),
        LASTNONBLANK ( Table1[MetricConstant], 1 )
    )
)
Now when you pull in the Date and the Measure, it will only show you the most recent available value.
EDIT-Updated based on comments. If you want to view which Metric is being used you need another measure
MetricMeasure = CALCULATE(MAX(Table1[Metric]),LASTNONBLANK(Table1[MetricConstant],1))
You could create an area chart based on that, and add this to the tooltip.

DAX measure: project duration (days) from dimension starting & ending date

I have following scenario which has been simplified a little:
Costs fact table:
date, project_key, costs €
Project dimension:
project_key, name, starting date, ending date
Date dimension:
date, years, months, weeks, etc
I need to create a measure that returns the project duration in days using the starting and ending dates from the project dimension. The first challenge is that there aren't transactions for all days in the fact table. A project's starting date might be 1st of January while its first cost transaction in the fact table is on 15th of January. So we still need to count the days between the starting and ending dates that fall within the filter context.
The second challenge is the filter context. The user might want to view only February. So if the project starting date is 1.6.2016 and the ending date is 1.11.2016 and the user wants to view only September, it should display only 30 days.
The third challenge is to view days for multiple projects. If the user selects only a single day, it should show the count of all projects in progress.
I'm thankful for any help that could lead towards the solution. Don't hesitate to ask for more details if needed.
edit: Here is a picture to explain this better:
Update 7.2.2017
Still trying to create a single measure for this solution. Measure which user could use with only dates, projects or as it is. Separate calculated column for ongoing project counts per day would be easy solution but it would only filter by date table.
Update 9.2.2017
Thank you all for your efforts. As an end result I'm confident that calculations not based on fact table are quite tricky. For this specific case I ended up doing new table with CROSS JOIN on dates and project ids to fulfill all requirements. One option also was to add starting and ending dates as own lines to fact table with zero costs. The real solution also have more dimensions we need to take into consideration.
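For readers wondering what such a table could look like, here is one possible sketch as a calculated table. This is not the asker's actual code: it uses GENERATE rather than a plain CROSSJOIN so that only the dates inside each project's range are kept, and it assumes the project_dim and date_dim names used in the answers below.

```dax
Project Dates =
GENERATE (
    project_dim,
    -- For each project row, keep only the dates within its date range
    FILTER (
        VALUES ( date_dim[date] ),
        date_dim[date] >= project_dim[starting_date]
            && date_dim[date] <= project_dim[ending_date]
    )
)
```

Each row is then one project-day, so counting rows gives a duration that can be filtered by both the date and project columns.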
To get the expected result you have to create a calculated column and a measure. The calculated column counts the number of projects in progress on each date, and the measure counts the number of days elapsed between [starting_date] and [ending_date] in each project, taking filters into account.
The calculated column has to be created in the date_dim table using this expression:
Count of Projects =
SUMX (
    FILTER (
        project_dim,
        [starting_date] <= EARLIER ( date_dim[date] )
            && [ending_date] >= EARLIER ( date_dim[date] )
    ),
    1
)
The measure should be created in the project_dim table using this expression:
Duration (Days) =
DATEDIFF (
    MAX ( MIN ( [starting_date] ), MIN ( date_dim[date] ) ),
    MIN ( MAX ( [ending_date] ), MAX ( date_dim[date] ) ),
    DAY
) + 1
The result you will get is something like this:
And this if you filter the week using a slicer or a filter on the date_dim table
Update
Support for SSAS 2014: DATEDIFF() is available only from SSAS 2016 onwards.
First of all, it is important to realize that you are measuring two different things but want only one measure visible to your users. In the first expected result you want the number of projects running on each date, while in expected results 2 and 3 (in the OP) you want the days elapsed in each project, taking into account filters on date_dim.
You can create a measure to wrap both calculations in one and use HASONEFILTER to determine the context in which each should run. Before continuing with the wrapping measure, check the measure below, which replaces the measure posted above that uses the DATEDIFF function, since DATEDIFF doesn't work in your environment.
After creating the calculated column above, which is required to determine the number of projects on each date, create a measure called Duration Measure. This measure won't be used by your users directly, but it lets us calculate the final measure.
Duration Measure =
SUMX (
    FILTER (
        date_dim,
        date_dim[date] >= MIN ( project_dim[starting_date] )
            && date_dim[date] <= MAX ( project_dim[ending_date] )
    ),
    1
)
Now the final measure, which your users should interact with, can be written like this:
Duration (Days) =
IF (
    HASONEFILTER ( date_dim[date] ),
    SUM ( date_dim[Count of Projects] ),
    [Duration Measure]
)
This measure will determine the context and will return the right measure for the given context. So you can add the same measure for both tables and it will return the desired result.
Although this solution is demonstrated in Power BI, it works in Power Pivot too.
First I would create 2 relationships:
project_dim[project_key] => costs_fact[project_key]
date_dim[date] => costs_fact[date]
The Costs measure would be just: SUM ( costs_fact[costs] )
The Duration (days) measure needs a CALCULATE to change the filter context on the Date dimension. This is effectively calculating a relationship between project_dim and date_dim on the fly, based on the selected rows from both tables.
Duration (days) =
CALCULATE (
    COUNTROWS ( date_dim ),
    FILTER (
        date_dim,
        date_dim[date] >= MIN ( project_dim[starting_date] )
            && date_dim[date] <= MAX ( project_dim[ending_date] )
    )
)
I suggest you separate the Duration (days) measure into a different calculated column and measure, as they don't actually have the same meaning under different contexts.
First of all, create a one-to-many relationship between dates/costs and projects/costs. (Note the single cross filter direction or the filter context will be wrongly applied during calculation)
For the Expected result 1, I've created a calculated column in the date dimension called Project (days). It counts how many projects are in progress for a given day.
Project (days) =
COUNTROWS (
    FILTER (
        projects,
        dates[date] >= projects[starting_date]
            && dates[date] <= projects[ending_date]
    )
)
P.S. If you want to have aggregated results on weekly/monthly basis, you can further create a measure and aggregate Project (days).
For Expected result 2 and 3, the measure Duration (days) is as follows:
Duration (days) =
COUNTROWS (
    FILTER (
        dates,
        dates[date] >= FIRSTDATE ( projects[starting_date] )
            && dates[date] <= FIRSTDATE ( projects[ending_date] )
    )
)
The result will be as expected: