DAX AVERAGE() function differs from table average of column

I am new to DAX and have run into a problem with averages.
In Power BI I am using a table with some dimensions and measures. The measures need some form of summarization so that the dimensions can roll up the data.
I have a measure, call it Minutes, that I want to average across the 5 rows of data. When I drag the Minutes column onto the table and set its summarization to Average, it works perfectly and splits the data according to the dimensions. Data example:
1. A 10
2. A 8
3. A 10
4. A 7
5. A 5
6. B 10
7. B 10
8. B 9
9. B 9
10. B 10
Output in Table:
1. A 8
2. B 9.6
If I want to use those averages of 8 and 9.6 in another calculation and I create a new column called AvgMins = AVERAGE(Minutes) and drag it onto the grid, I get a value of 8.8 for both A and B. I understand that the most likely reason is that the calculation happens before the dimension splits the data, so the grid cannot break it down. How do I handle this in the DAX column calculation itself?
As pointed out by Jos, I was creating the calculation as a column instead of a measure. After changing it to a measure, the normal AVERAGE() works perfectly.

You should be using a Measure, not a Calculated Column.

A measure can return many different numbers, depending on the filter context.
AvgMins = AVERAGE('Table'[Minutes])
Without any filter, this returns the average of the whole Minutes column, which is 8.8. If you filter it by your category (A and B), it returns the average for all A rows and for all B rows, which is 8 and 9.6.

If you are looking for the average of the averages:
AVERAGEX(
    VALUES('Table'[CategoryColumnName]),
    CALCULATE(AVERAGE('Table'[Minutes]))
)
If you are looking for the average per category:
CALCULATE(
    AVERAGE('Table'[Minutes]),
    ALLEXCEPT('Table', 'Table'[CategoryColumnName])
)
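To check the numbers outside Power BI, here is a minimal sketch in plain Python (not DAX) that reproduces the three results discussed above from the sample rows in the question. The variable names are illustrative only.

```python
# Sample rows from the question: (category, minutes)
rows = [("A", 10), ("A", 8), ("A", 10), ("A", 7), ("A", 5),
        ("B", 10), ("B", 10), ("B", 9), ("B", 9), ("B", 10)]

# No filter: average over every row.
overall = sum(m for _, m in rows) / len(rows)

# Filtered by category: collect the minutes per category, then average each group.
groups = {}
for category, minutes in rows:
    groups.setdefault(category, []).append(minutes)
per_category = {c: sum(v) / len(v) for c, v in groups.items()}

# Average of the per-category averages.
average_of_averages = sum(per_category.values()) / len(per_category)

print(overall, per_category, average_of_averages)
```

The average of averages matches the overall average here only because both categories have the same number of rows. With unequal group sizes the two values would differ.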

Related

Power BI: Calculate Opposite Row Count from Measures

I have a dashboard with a table that shows row counts for a database table according to their codes.
I have divided these counts between measures using CALCULATE to determine whether a row in the table contains a specific code. For example,
measure1 = CALCULATE(COUNT(table[id]),table[code]=1)
I made 5 measures with specific codes so that they are shown in the table as columns. I also use ANDs and ORs in the measures when comparing two codes, if more complex logic is needed.
Now, I have an idea for a new measure. I want to count all of the rows except the rows from the 5 previously created measures. How can I use those 5 measures in the logic of my new measure? Or, is there any other way to solve my problem?

You wouldn't be able to reference your 5 previous measures directly in your new measure, since those measures result in a number (count), but don't give you any information about the logic behind how that number is calculated.
To accomplish what you want, you basically just want to negate the logic from those 5 measures directly in your new measure. For example, if 2 of your previous measures look like the following.
Measure 1 = CALCULATE(COUNT(table[id]), table[code] = 1)
and
Measure 2 = CALCULATE(COUNT(table[id]), table[code] = 2)
Then your new measure, which calculates the number of rows NOT counted in the above measures, would look like the following.
Measure 3 = CALCULATE(COUNT(table[id]), NOT(table[code] = 1), NOT(table[code] = 2))
You would do this negation with the logic from all 5 of your previous measures, rather than just the 2 given in the example above.
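The same negation can be sketched in plain Python (not DAX) on a few made-up rows. The ids and codes below are hypothetical and only illustrate the counting logic.

```python
# Hypothetical rows: (id, code). The values are made up for illustration.
rows = [(1, 1), (2, 2), (3, 3), (4, 1), (5, 4), (6, 2), (7, 5)]

# Counterparts of Measure 1 and Measure 2: rows with a specific code.
measure_1 = sum(1 for _, code in rows if code == 1)
measure_2 = sum(1 for _, code in rows if code == 2)

# Counterpart of Measure 3: both negated conditions must hold for a row to count.
measure_3 = sum(1 for _, code in rows if not code == 1 and not code == 2)

print(measure_1, measure_2, measure_3)
```

Because the two conditions never match the same row, the three counts add up to the total number of rows. That shortcut would not hold if the original measures overlapped.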

How to create a measure that goes first through equipment and then sums it all

I have a database with the values "Date", "StoppedTime", "PlannedProductionQtt" and "PlannedProductionTime". These values are recorded per equipment, as in the small example below.
What I need to do is divide PlannedProductionQtt by PlannedProductionTime and then multiply by StoppedTime. After this, I want to make a graph that shows it day by day.
At first I thought it was easy and made a new measure PlannedProductionQtt/PlannedProductionTime = SUM(PlannedProductionQtt)/SUM(PlannedProductionTime) (assume it worked without the table name).
Then I made another measure, Impact = SUM(StoppedTime)*PlannedProductionQtt/PlannedProductionTime.
When I plotted a clustered column chart with this measure as the values and the day on the axis, I thought at first that I had nailed it, but no. Power BI summed all of PlannedProductionQtt for the day, divided that by the sum of all PlannedProductionTime for the day, and multiplied the result by the sum of StoppedTime for that day.
Unfortunately, this gives me wrong results. What I need is a measure (or several measures) that does the calculation equipment by equipment and then sums the results by day.
I don't want to create new tables or columns for these calculations, because I have 32 items of equipment, 3+ years of data, more than one classification of StoppedTime, and the PlannedProduction data uses more than one row per day per equipment.
To make it clear I added one column as Impact to show the difference.
So, if I sum the column Impact per day, I would have for day 1,2 and 3 the results 110725, 61273 and 220833.
However, if I first sum all the PlannedProductionQtt for day 1, divide it by the sum of PlannedProductionTime for day 1 and multiply it by the sum of StoppedTime for day 1 (which is how Power BI calculates it), I get 146497.
I inserted the difference in the table below to make the differences clear:

As Jon suggested in a comment, here is what solved my needs:
measure_name = SUMX( source_table , DIVIDE ( source_table[PlannedProductionQtt] , source_table[PlannedProductionTime] , 0 ) ) * SUM( source_table[StoppedTime] )

You have two different types of data you want to divide there, time and int, so you would probably need to unify them. The easiest way to do that is from the Transform data panel, by selecting the column and changing its format.
The division itself is fairly easy. Can you try creating a new measure as follows?
measure_name = SUMX(
    <source_table>,
    DIVIDE(
        <source_table>[PlannedProductionQtt],
        <source_table>[PlannedProductionTime],
        0
    ) * <source_table>[StoppedTime]
)
Then it's only a matter of using it as the values in a graph, with the 'Date' column on the x axis.
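The gap the question describes, between calculating row by row and aggregating first, can be reproduced with a short plain-Python sketch (not DAX). The two rows below are made-up numbers, not the asker's data.

```python
# Hypothetical rows for a single day:
# (equipment, planned_qty, planned_time, stopped_time). Values are made up.
rows = [
    ("E1", 1000.0, 10.0, 2.0),
    ("E2", 300.0, 6.0, 4.0),
]

# Row by row: each equipment's rate times its own stopped time, then summed.
impact_per_row = sum(qty / planned * stopped for _, qty, planned, stopped in rows)

# Aggregate first: total quantity / total planned time * total stopped time.
total_qty = sum(r[1] for r in rows)
total_planned = sum(r[2] for r in rows)
total_stopped = sum(r[3] for r in rows)
impact_aggregated = total_qty / total_planned * total_stopped

print(impact_per_row, impact_aggregated)
```

The two results differ (400.0 versus 487.5 for these rows), which is why a plain SUM-based measure does not match a per-equipment calculation.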

Power BI DAX Sum By Year

Very new to Power BI and DAX and would appreciate a push in the right direction with this, seemingly simple, scenario, please. I have the following dataset:
Job Name  Quarter  Year  Cost
alpha     1        2019  210
alpha     2        2019  100
alpha     3        2019  90
alpha     4        2019  28
beta      1        2020  100
kappa     1        2019  100
kappa     2        2019  90
beta      2        2020  100
beta      3        2020  75
beta      4        2020  30
kappa     3        2019  10
kappa     4        2019  30
All I am trying to do is get a measure/calculated column which calculates the total [Cost] per [Job Name] for each year. So for example I would get, for alpha, the value: 428. For kappa it would be: 230.
Thanks!

Use the DAX below to create a new table that gives the above result:
New Table = SUMMARIZE('Table', 'Table'[Job Name], "New Column", SUM('Table'[Cost]))

Using SUMMARIZE (as @Shilpa suggests) will add a new table to your report's data source. I don't think that is what you need.
Since you are looking for a solution that uses a measure or a calculated column, here are a few details about each, because they do not work the same way.
Calculated columns:
Evaluated for each row in your table, immediately after you hit 'Enter' to complete the formula.
Calculate new values from existing values for each specific row. For example, if you have the value 5 in column "A" and 4 in column "B", you can create a new calculated column C as (A x B), which will store the result 20. The calculation (A x B) is evaluated for every row.
The result of a calculated column is saved to the model like any other column's values.
Measures:
Evaluated when you use them in a visual and the visual is rendered.
A measure typically returns an aggregated value, such as a SUM, AVERAGE or COUNT.
Not saved anywhere (well, actually there's a cache in the report layer, but it's not part of the file when you hit Save).
Now, for your scenario I think you need a simple measure like this:
total_cost = SUM('your_table'[Cost])
Your measure is now ready. Just pull the column "Job Name" and the measure "total_cost" into your visual and you will get your expected output. You can use a slicer to check the value for different dimensions. Just play around :)
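As a quick check of the expected totals, this plain-Python sketch (not DAX) groups the sample rows from the question by job name and sums the cost, which is what the measure returns once "Job Name" is on the visual.

```python
# Sample rows from the question: (job_name, quarter, year, cost)
rows = [
    ("alpha", 1, 2019, 210), ("alpha", 2, 2019, 100),
    ("alpha", 3, 2019, 90), ("alpha", 4, 2019, 28),
    ("beta", 1, 2020, 100), ("kappa", 1, 2019, 100),
    ("kappa", 2, 2019, 90), ("beta", 2, 2020, 100),
    ("beta", 3, 2020, 75), ("beta", 4, 2020, 30),
    ("kappa", 3, 2019, 10), ("kappa", 4, 2019, 30),
]

# Sum the cost per (job name, year) pair.
totals = {}
for job, _quarter, year, cost in rows:
    totals[(job, year)] = totals.get((job, year), 0) + cost

print(totals)
```

This gives 428 for alpha and 230 for kappa, matching the values the question asks for.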

Use a slicer on a calculated column

I need to implement the following formula in Power BI, one way or another:
Total Hours of Use of a Machine = Hours Function * Range of Functioning
where Hours Function is the hours of use of a certain machine (take it as given that this is a constant for each machine) and Range of Functioning is the difference between the final date of the evaluation and the initial date, measured in hours.
For example, I want to measure the Total Hours of Use of a Machine between 14/10/2019 and 15/10/2019. The math is the following:
Assume: 2 machines
Hours Function machine A: 6
Hours Function machine B: 9
Range of Functioning = 15/10/2019 - 14/10/2019 = 24 hours
The output:
Total Hours of Use of a Machine A: 144
Total Hours of Use of a Machine B: 216
I need to do this in Power BI in such a way that when any user moves a date slicer, the Total Hours of Use of a Machine is refreshed.
I can't find any way to get the difference between the final date of the evaluation and the initial date and use it in DAX or in a new column.

You have to use measures if you want the value to be recalculated when you change the dates with a slicer.
The first step is to make sure you can calculate the number of days selected by your slicer.
It seems easy, but if you use the FIRSTDATE function on the calendar table that is integrated directly into Power BI, you will never get what you expect.
The trick to getting this number of days is to count the rows in your calendar table with the COUNTROWS function.
Once you have the number of days, you just have to multiply it by 24 (hours) and by the sum of your "Hours Function" for the machine (6 for A and 9 for B in your example).
(It's important to use SUM or another aggregate function such as AVERAGE, because if there are multiple values the measure fails with an error: it needs a single value to multiply.)
The DAX formula looks like this:
= COUNTROWS('Calendar') * 24 * SUM(Machine[Hours function])
You can then display this measure filtered by the machine name and a date slicer (based on your calendar table).
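The arithmetic behind that measure can be checked with a small plain-Python sketch (not DAX). It assumes the slicer selection amounts to a single calendar day, matching the 24-hour range in the question; the function name is hypothetical.

```python
from datetime import date

def total_hours_of_use(hours_function, selected_days):
    """Number of selected days * 24 hours * the machine's Hours Function."""
    return len(selected_days) * 24 * hours_function

# Assumed selection: one calendar row, i.e. a 24-hour range.
selected_days = [date(2019, 10, 14)]

# Hours Function per machine, from the question.
hours_function = {"A": 6, "B": 9}

totals = {m: total_hours_of_use(h, selected_days) for m, h in hours_function.items()}
print(totals)
```

With one selected day this gives 144 for machine A and 216 for machine B, the output the question expects. Selecting more days scales both values proportionally.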

Power BI: Calculating STDEVX.P over 6-Month period

I am attempting to calculate the most recent 6-month STDEVX.P (not including the current month; so in May 2017, I'd like the STDEVX.P for the periods Nov 2016 - Apr 2017) for sales by product, in order to further calculate the variation in sales orders.
The Sales Data is made up of daily transactions so it contains transaction date: iContractsChargebacks[TransactionDate] and units sold: iContractsChargebacks[ChargebackUnits], but if there are no sales in a given period, then there will be no data for that month.
So, for example, on July 1st, sales for the past 6 months were the following:
Jan 100
Feb 125
Apr 140
May 125
Jun 130
March is missing because there were no sales. So, when I calculate STDEVX.P on the data set, it is calculating it over 5 periods, when in fact there were 6, just one happens to be zero.
At the end of the day, I need to calculate STDEVX.P for the current six-month period. If pulling the monthly sales numbers returns only 3 periods (months), the calculation needs to treat the other 3 periods as zero.
I thought about manually calculating standard deviation instead of using the DAX STDEVX.P formula and found these 2 links as a reference on how to do so, the first being closest to my need:
https://community.powerbi.com/t5/Desktop/Problem-with-STDEV/td-p/19731
Calculating the standard deviation from columns of values and frequencies in Power BI...
I attempted to make a go of it, but still am not getting the correct calculation. My code is:
STDEVX2 =
VAR Averageprice = [6M Sales]
VAR months = 6
RETURN
    SQRT(
        DIVIDE(
            SUMX(
                FILTER(
                    ALL(DimDate),
                    DimDate[Month ID] <= (MAX(DimDate[Month ID]) - 1) &&
                    DimDate[Month ID] >= (MAX(DimDate[Month ID]) - 6)
                ),
                (iContractsChargebacks[SumOfOrderQuantity] - Averageprice)^2
            ),
            months
        )
    )
*Note: instead of using date parameters in the code, I created a calculated column in the date table that gives each month a unique ID, which makes it easier for me.

Your question would definitely be easier to answer with more explanation regarding your model. E.g. how you defined [SumOfOrderQuantity] and [6M Sales], since a mistake there could definitely impact the final result. Also, knowing what the result you're seeing is vs. the result you expect would be helpful (using sample data).
My guess, however, is that your DimDate table is a standard date table (with one row per date), but you want standard deviation by month.
The FILTER statement in your formula limits the date range to the prior 6 full months correctly, but it will still have one row per date. You can confirm this in Power BI by going into the Data View, selecting 'New Table' under Modeling on the ribbon, and putting your FILTER statement in:
Table = FILTER(ALL(DimDate),
DimDate[MonthID]<=(MAX(DimDate[MonthID])-1) &&
DimDate[MonthID]>=(MAX(DimDate[MonthID])-6))
Assuming you have more than one day of sales for a given month, calculating the variance by day rather than by month is going to mess things up.
What I'd suggest trying:
Table = FILTER(SUMMARIZE(ALL(DimDate), DimDate[MonthID]),
DimDate[MonthID]<=(MAX(DimDate[MonthID])-1) &&
DimDate[MonthID]>=(MAX(DimDate[MonthID])-6))
The additional SUMMARIZE statement means that you only get one row for each MonthID, rather than 1 row for each date. If your [6M Sales] is the monthly average across all 6 months, and [SumOfOrderQuantity] is the monthly sum for each month, then you should be set to go calculating the variance, squaring, dividing by 6, and square rooting.
If you need to do further troubleshooting, remember you can put a table on your canvas with MonthID, SumOfOrderQuantity and [6M Sales] and compare the numbers you expect at each stage of the calculation with the numbers you're seeing.
Hope this helps.

I faced a similar problem while trying to calculate the coefficient of variation (standard deviation / mean) by SKU from sales data. I used the Pivot and Unpivot functions in the Power Query editor to get around the problem of months with missing sales:
1) Export the data with any calculated columns
2) Reimport the data so that the calculated columns are also available in the Power Query editor
3) Pivot the data by month
4) Replace null values with 0s
5) Unpivot the data
6) Close and apply the query
7) Add a calculated column for the coefficient of variation using the formula
CV = CALCULATE(STDEV.P(Table1[Value]),ALLEXCEPT(Table1,Table1[Product]))/CALCULATE(AVERAGE(Table1[Value]),ALLEXCEPT(Table1,Table1[Product]))
Thus zero sales for the missing months will also be considered both for Standard Deviation and Mean.
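The effect of filling the missing months with zero can be seen in a short plain-Python sketch (not DAX), using the monthly figures from the question. It relies on `statistics.pstdev`, the population standard deviation, as a stand-in for what STDEVX.P computes over monthly totals.

```python
from statistics import pstdev

# Monthly sales from the question; March has no rows in the source data.
sales = {"Jan": 100, "Feb": 125, "Apr": 140, "May": 125, "Jun": 130}
months = ["Jan", "Feb", "Mar", "Apr", "May", "Jun"]

# Only the months that have data (5 periods).
std_present = pstdev(list(sales.values()))

# Treat the missing month as zero sales (6 periods).
filled = [sales.get(month, 0) for month in months]
std_filled = pstdev(filled)

print(round(std_present, 2), round(std_filled, 2))
```

Including the zero month lowers the mean and raises the standard deviation considerably, so the two approaches are not interchangeable.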
