Quartiles for Calculations - powerbi

I am working to recreate a capability analysis in Power BI visualizations. Since Power BI has no real built-in statistics capabilities, I am having to build it from scratch. I am running into an issue when I try to get the average of my 3rd quartile for a "BSTP" (Best Short Term Performance) measurement. The goal formula that uses BSTP is
Goal = Baseline (µ) + |0.7 x (BSTP - Baseline (µ))|
(µ) is the AVERAGE(), which is easy: check.
0.7 = variance of the process, again easy: check.
BSTP = sum of the 3rd quartile / count of data points in the 3rd quartile: not so easy, and no check...
For calculation example...
Let's say we have 100 data points equally distributed from 1 to 100
Min = 1
Q1 = 25
Q2 = 50
Q3 = 75
Q4 = 100
The BSTP calculation for this process in Excel would be as follows:
SUMIF("data value" >= 75 (Q3)) / COUNTIF("data value" >= 75)
So if the "data value" is greater than or equal to 75, sum it, then divide by the count of data values that are greater than or equal to 75... In this example we would have 25 data values between 51 and 75 that sum to 1575 (51+52+53+54...). There were 25 data points summed up, so the final calculation would be
1575/25 = 63 (BSTP)
To sum up, this is the formula I am trying to solve for in Power BI:
SUMIF("data value" >= 75 (Q3)) / COUNTIF("data value" >= 75)
I appreciate your insights!

DAX does have some stats functions, so this one is pretty simple.
If you're looking for the average of the 3rd quartile, then you want
SUMIF( 50 <= "data value" < 75 ) / COUNTIF( 50 <= "data value" < 75 )
or simply
AVERAGEIF( 50 <= "data value" < 75 )
You can use the percentile functions to calculate where your quartiles lie and then average over that subset:
3rdQtlAvg =
VAR Q2 = PERCENTILE.INC ( Table1[Val], 0.50 )
VAR Q3 = PERCENTILE.EXC ( Table1[Val], 0.75 )
RETURN
    AVERAGEX (
        FILTER (
            Table1,
            Table1[Val] >= Q2
                && Table1[Val] <= Q3
        ),
        Table1[Val]
    )
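For anyone who wants to check the arithmetic outside Power BI, here is a minimal Python sketch of the same idea: find the Q2 and Q3 cut points, then average the values between them. It assumes the inclusive quantile method for both cut points (the DAX above mixes PERCENTILE.INC and PERCENTILE.EXC, so the boundaries may differ slightly), and the function names are made up for illustration.

```python
import statistics


def third_quartile_average(values):
    """Average of the values lying between Q2 (the median) and Q3, inclusive."""
    # With n=4, statistics.quantiles returns the three cut points [Q1, Q2, Q3].
    _, q2, q3 = statistics.quantiles(values, n=4, method="inclusive")
    subset = [v for v in values if q2 <= v <= q3]
    return sum(subset) / len(subset)


def goal(baseline, bstp, factor=0.7):
    """Goal = Baseline + |factor * (BSTP - Baseline)|, as written in the question."""
    return baseline + abs(factor * (bstp - baseline))


data = list(range(1, 101))            # the question's example: 1, 2, ..., 100
bstp = third_quartile_average(data)   # averages the 25 values 51..75
baseline = statistics.mean(data)
print(bstp, goal(baseline, bstp))
```

With the question's 1 to 100 example, the subset is 51 through 75, which reproduces the 1575/25 figure from the question.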

SUM of Column Virtual Table

I am facing this problem that I cannot solve.
I have this virtual table in Power BI Desktop (recreated in Excel for this question), where the last column should show the sum of the TOTAL column as its subtotal. Instead, it shows AVAILABLE * AVERAGE PRICE, because that is the formula used for each row.
Here are the formulas I used:
AVERAGE PRICE = divide (SUMX (zzzAcqVen1, zzzAcqVen1 [ACQ AMOUNT AT COST]), SUMX (zzzAcqVen1, zzzAcqVen1 [QUANTITY PURCHASED]))
AMOUNT INPUT = SUM (zzzAcqVen1 [Q. INPUT]) * [AVERAGE PRICE]
https://i.stack.imgur.com/OJjY1.jpg
In red: the wrong result.
In green: the right result.
Thanks everyone
Hello. First, calculate AVERAGE_PRICE as a calculated column:
AVERAGE_PRICE =
DIVIDE ( [ACQ AMOUNT AT COST], [QUANTITY PURCHASED] )
Then calculate AMOUNT_INPUT as a measure:
AMOUNT_INPUT = SUMX ( zzzAcqVen1, [Q. INPUT] * [AVERAGE_PRICE] )
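To see why the subtotal changes, here is a small Python sketch comparing the two ways of totalling. The rows are hypothetical (they are not taken from the screenshot), and the function names are made up; the point is only that summing row by row differs from applying the formula to the aggregated columns.

```python
# Hypothetical rows: (quantity_purchased, amount_at_cost, quantity_input).
ROWS = [
    (10, 20.0, 4),   # price 2.0 per unit, 4 units input -> 8.0
    (5, 20.0, 1),    # price 4.0 per unit, 1 unit input  -> 4.0
]


def total_from_rows(rows):
    """Row-by-row total, the behaviour SUMX gives: sum of input * row price."""
    return sum(q_in * cost / q_bought for q_bought, cost, q_in in rows)


def total_from_aggregates(rows):
    """Formula applied once to the totals: SUM(input) * (SUM(cost) / SUM(purchased))."""
    total_bought = sum(q_bought for q_bought, _, _ in rows)
    total_cost = sum(cost for _, cost, _ in rows)
    total_input = sum(q_in for _, _, q_in in rows)
    return total_input * total_cost / total_bought


print(total_from_rows(ROWS), total_from_aggregates(ROWS))
```

The two totals only agree when every row has the same price, which is why the original measure looked right per row but wrong in the subtotal.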

Power BI: Percentage of total over multiple sliced categories

I have data in the following format:
Category1  Category2  Category3  ...  Amount
A          D          X          ...  2
A          E          X          ...  5
A          E          Y          ...  1
B          D          Z          ...  10
B          F          X          ...  2
I want to be able to show (in a card, probably) the amount of Category1 with value A (this category has only 2 values) as a percentage of all. But I have slicers/controls on all the other categories and I want this to reflect the values set on those at any time. There are quite a few categories.
For example, when no controls are applied, the sum of amounts with Category1=A is 8 and the overall total is 20, so the percentage would be 40%. With Category3 set to 'X', the sum of Category1=A is 7 and the overall total is 9, so the percentage would be 78%. But I want this to work with any combination of one or more slicers over many categories.
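The expected numbers in that example can be checked with a short Python sketch over the sample rows. The `slicers` dictionary is a stand-in for whatever the report user has selected, and the function name is made up for illustration.

```python
FIELDS = ("Category1", "Category2", "Category3", "Amount")
ROWS = [
    ("A", "D", "X", 2),
    ("A", "E", "X", 5),
    ("A", "E", "Y", 1),
    ("B", "D", "Z", 10),
    ("B", "F", "X", 2),
]


def share_of_a(rows, slicers=None):
    """Amount where Category1 == 'A' as a fraction of the total, after slicers."""
    slicers = slicers or {}
    records = [dict(zip(FIELDS, row)) for row in rows]
    # Keep only rows that match every active slicer selection.
    visible = [r for r in records if all(r[col] == val for col, val in slicers.items())]
    total = sum(r["Amount"] for r in visible)
    if total == 0:
        return None  # nothing selected, so no meaningful percentage
    part = sum(r["Amount"] for r in visible if r["Category1"] == "A")
    return part / total


print(share_of_a(ROWS))                       # no slicers: 8 / 20
print(share_of_a(ROWS, {"Category3": "X"}))   # Category3 = X: 7 / 9
```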
I've found this on SO - Percentage value of a segment against segment total Power BI(DAX) - and tried creating a helper column to give the Amount where Category1=A and 0 if not. I've then tried to create a measure, using (pseudocode):
%Amount = DIVIDE ( SUM ( Table[HelperColumn] ), CALCULATE ( SUM (Table[Amount]), ALLEXCEPT(Table[Category1]) ) )
but it's showing 0%.
I'm new to Power BI and DAX and trying to replicate something I've done in another system, so I don't really understand what that formula is doing and I'm not really sure where to go from here.
I created two DAX measures.
The first is the numerator, which calculates the Stack amount based on the values selected for Category2 and Category3:
numerator =
VAR cat3Selected = SELECTEDVALUE ( Stack[Category3] )
VAR cat2Selected = SELECTEDVALUE ( Stack[Category2] )
VAR cat1Total =
    CALCULATE (
        IF (
            AND ( ISBLANK ( cat2Selected ), ISBLANK ( cat3Selected ) ),
            SUM ( Stack[Amount] ),
            IF (
                AND ( NOT ( ISBLANK ( cat2Selected ) ), NOT ( ISBLANK ( cat3Selected ) ) ),
                CALCULATE ( SUM ( Stack[Amount] ), Stack[Category2] = cat2Selected && Stack[Category3] = cat3Selected ),
                IF (
                    NOT ( ISBLANK ( cat2Selected ) ),
                    CALCULATE ( SUM ( Stack[Amount] ), Stack[Category2] = cat2Selected ),
                    CALCULATE ( SUM ( Stack[Amount] ), Stack[Category3] = cat3Selected )
                )
            )
        )
    )
RETURN
    cat1Total
The second (Denominator) calculates the Stack amount across all Category1 values:
Denominator = CALCULATE ( SUM ( Stack[Amount] ), ALL ( Stack[Category1] ) )
Then calculate the percentage:
Percentage = DIVIDE ( [numerator], [Denominator], 0 )

PowerBI: how to get average of sum of multiple categories

For a report, I would like to have the average of a sum over 2 categories (data_type_project & ID).
In Power BI, I would like to have 2 cards, each showing 1 number:
data_type_project - non-standard, 30.44
data_type_project - standard, 9.47
The numbers are composed as follows: (29 + 33 + 20.25 + 39.5) / 4 = 30.44
and (3.5 + 25.5 + 9.75 + 17.5 + 17.25 + 14.5 + 1.5 + 1.25 + 4.5) / 9 = 9.47
How can I calculate this number in DAX?
You could try the following and use ISINSCOPE() function:
measure_hours =
-- will check if 'Table'[ID] is in the current filter context
VAR cond = ISINSCOPE ( 'Table'[ID] )
RETURN
    IF (
        cond,
        SUM ( 'Table'[hours] ),
        AVERAGE ( 'Table'[hours] )
    )
You can create a measure and use it in place of hours:
Measure = SUM ( 'Table'[hours] ) / COUNT ( 'Table'[ID] )
I've found a workaround that answers my question: I created a measure, put it in a card visual, and set the legend to 'data_type_project'. Thank you for the help! :)
measure =
AVERAGEX (
    KEEPFILTERS ( VALUES ( 'Table'[ID] ) ),
    CALCULATE ( SUM ( 'Table'[hours] ) )
)
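The logic of that measure (sum the hours per ID first, then average those sums) can be sketched in Python as follows. The individual rows are hypothetical; they were only chosen so that the per-ID sums match the question's non-standard group (29, 33, 20.25 and 39.5).

```python
from collections import defaultdict


def average_of_sums(rows):
    """Sum hours per ID, then average the per-ID sums."""
    totals = defaultdict(float)
    for row_id, hours in rows:
        totals[row_id] += hours
    return sum(totals.values()) / len(totals)


# Hypothetical (ID, hours) rows; only the per-ID sums come from the question.
rows = [
    (1, 14.5), (1, 14.5),   # ID 1 -> 29
    (2, 33.0),              # ID 2 -> 33
    (3, 20.25),             # ID 3 -> 20.25
    (4, 19.5), (4, 20.0),   # ID 4 -> 39.5
]
print(round(average_of_sums(rows), 2))
```

A plain average over the six rows would give a different figure, which is why the iteration over distinct IDs matters.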

PowerBI Dynamic binning (ranges change) based on value of measure

I’m trying to represent some continuous data via binning. Continuous weighting data of an area should be binned as: VeryHigh, High, Low, VeryLow. The weighting values are based on an interaction between certain Types of events grouped by an Area and so can change depending on the Type selected by the report user.
I have included some sample data below and an outline of what’s been done so far.
Start with five sets of area data (A-E). Within each is one or more incident Types. Each incident has a Weighting and the number of times (Count) it occurs within the Area.
Add a calculated column CC_ALL_WGT (weighting * count)
Create a measure:
M_WGT = DIVIDE(SUM(sample_data[CC_ALL_WGT]), SUM(sample_data[4_count]))
This makes sense once grouped by Area and we can see that the Area gets an overall Weighting Score
This can be altered by slicing the data based on which Type of incident we wish to inspect:
We can also set up additional measures to get the Min; Max; Median from the Measure based on the Type selection:
M_MIN_M_WGT =
IF (
    COUNTROWS ( VALUES ( sample_data[1_area] ) ) = 1,
    sample_data[M_WGT],
    MINX (
        VALUES ( sample_data[1_area] ),
        sample_data[M_WGT]
    )
)
Which change as expected when a Slicer selection is made
Also set up a measure to determine the Mid-Point between the Minimum and the Median and Mid-Point between the Maximum and the Median
M_MidMinMed =
sample_data[M_MED_M_WGT] - ((sample_data[M_MED_M_WGT] - sample_data[M_MIN_M_WGT]) / 2)
What I would like to do with these values is create a banding based on the following:
VeryLow: (Minimum to MinMed mid-point)
Low: (MinMed to Median)
High: (Median to MedMax mid-point)
VeryHigh: (MedMax to Maximum)
So based on the following selection
The bins would be set up as follows
VeryLow (0.59 to 0.76)
Low (0.76 to 0.93)
High (0.93 to 1.01)
VeryHigh (1.01 to 1.1)
Area A would be in Bin 4 (VeryHigh); Area B in Bin 2 (Low); Area C in Bin 1 (VeryLow); Area D in Bin 2 (Low); Area E in Bin 4 (VeryHigh)
If I select specific Types to review (via the slicer), the bins would be set up as follows:
VeryLow (0.35 to 0.61)
Low (0.61 to 0.88)
High (0.88 to 1.06)
VeryHigh (1.06 to 1.24)
So checking M_WGT (with types specified in the slicer):
Area A would be in Bin 4 (VeryHigh); Area B in Bin 2 (Low); Area C in Bin 1 (VeryLow); Area D in Bin 1 (VeryLow); Area E in Bin 4 (VeryHigh)
NOTE - The change in bin classification for Area D from Low to VeryLow
This is where I get stuck. This post specifies how to apply a static bin range: https://community.powerbi.com/t5/Desktop/Histogram-User-defined-bin-size/m-p/69854#M28961 but I’ve not been able to do this using dynamic or changing values (the Min; Max; Median; Midpoint) depending on selection.
The closest I’ve managed to apply is as follows:
Range =
VAR temp =
    CALCULATE ( sample_data[M_WGT] )
RETURN
    IF (
        temp < 0.76,
        "1_VeryLow",
        IF (
            AND ( temp > 0.76, temp <= 0.93 ),
            "2_Low",
            IF (
                AND ( temp > 0.93, temp <= 1.01 ),
                "3_High",
                "4_VeryHigh"
            )
        )
    )
Which permitted the following:
While I can then associate the Bins with a visual there are a number of things wrong with it. Firstly binning is occurring at the TYPE level not the AREA level. Secondly I’m manually setting the range values.
When I say Type levels what I mean is that they’re being binned at this level:
Whereas what I would like the histogram to be representing are the M_WGT values at the Area level.
If I slice by Area A only the problem is easier to see:
What I would like is for there to be one representation of Area A in the histogram (the bin for 1.10), not the three currently being shown (for each Type 1.9; 1; 0.35)
Hopefully I’ve managed to convey the problem and requirement.
Appreciate any advice or insight.
EDIT:
Link to Report + Data source is here: https://www.dropbox.com/sh/oganwruacdzgtzm/AABlggr3-xqdMvPjuR9EyrMaa?dl=0
You can define the bucket for an area all in a single measure:
Bucket =
VAR Weights =
    SUMMARIZE ( ALLSELECTED ( sample_data ), sample_data[1_area], "Wgt", [M_WGT] )
VAR MinW = MINX ( Weights, [Wgt] )
VAR MaxW = MAXX ( Weights, [Wgt] )
VAR MedW = MEDIANX ( Weights, [Wgt] )
VAR MinMedW = ( MinW + MedW ) / 2
VAR MedMaxW = ( MedW + MaxW ) / 2
VAR CurrW = CALCULATE ( [M_WGT], ALLSELECTED ( sample_data[2_type] ) )
RETURN
    SWITCH (
        TRUE (),
        CurrW <= MinMedW, "1_VeryLow",
        CurrW <= MedW, "2_Low",
        CurrW <= MedMaxW, "3_High",
        CurrW <= MaxW, "4_VeryHigh"
    )
This summarizes the weights over everything within your filter selections (ALLSELECTED) and then defines your boundaries as you specified. Then we calculate the weight for the current area across all selected types and pass that into the switch where we check the values from low to high.
Now you can't use a measure as an axis for a chart so if you want these buckets on the axis, I'd recommend defining an independent table.
Ranges =
DATATABLE (
    "Range", STRING,
    {
        { "1_VeryLow" },
        { "2_Low" },
        { "3_High" },
        { "4_VeryHigh" }
    }
)
Put Ranges[Range] on the axis and define a counting measure as appropriate.
CountArea =
COUNTROWS ( FILTER ( sample_data, [Bucket] = SELECTEDVALUE ( Ranges[Range] ) ) )
I don't really know what you're trying to count, whether it should be a distinct count, or if 4_count should be involved or not but modify this counting measure as necessary.
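For reference, the bucketing rule can be sketched outside DAX in a few lines of Python. The per-area weights below are partly hypothetical: the minimum (0.59), median (0.93) and maximum (1.10) come from the question, while the values for Areas B and E are made up to fall in the bins the question reports. The function name is also just for illustration.

```python
import statistics

LABELS = ("1_VeryLow", "2_Low", "3_High", "4_VeryHigh")


def bucket_areas(weights):
    """Assign each area a bucket from the min, median and max of the area weights."""
    values = list(weights.values())
    low, med, high = min(values), statistics.median(values), max(values)
    # Upper bound of each bucket, checked from lowest to highest like the SWITCH.
    upper_bounds = ((low + med) / 2, med, (med + high) / 2, high)
    return {
        area: next(label for bound, label in zip(upper_bounds, LABELS) if w <= bound)
        for area, w in weights.items()
    }


# Area weights after any slicer selection; B and E are assumed values.
weights = {"A": 1.10, "B": 0.80, "C": 0.59, "D": 0.93, "E": 1.05}
print(bucket_areas(weights))
```

Because the bounds are recomputed from whatever weights are passed in, a different slicer selection simply means calling the function with a different dictionary.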

Calculating proportion with negative float values

I'd like to know if there's any way in C++ to calculate a proportion when both the value and the extremes may be negative.
My goal is to sync a float text input widget with fixed extremes (e.g. the user can input any double value between A (min) and B (max), where A and B are arbitrary real constants) with a slider that can only slide between 0 and 100 (to simplify).
If A and B are positive, everything is trivial:
val_slider = ((val_textin-A)*100)/(B-A)
But since A and B can be any real numbers, it looks to me like the only possibility is to use several if/cases, or huge formulas involving a lot of abs() calls and division-by-zero checks, which are quite error-prone and very costly for such an easy task.
Is there any faster and shorter way to achieve the same in C/C++/STL?
Pardon my bad English. Any hints? Thank you.
Your formula should work fine with negative values of A and B as well, as long as A < B.
Example, if you want the user to be able to enter values from -100 to 100, and map these to a slider which goes from 0 - 100, when the user enters -90 you get:
((-90 - A) * 100) / (B - A) = ((-90 - (-100)) * 100) / (100 - (-100))
= 10 * 100 / 200
= 5
An input value of 50 results in a slider value of:
((50 - A) * 100) / (B - A) = ((50 - (-100)) * 100) / (100 - (-100))
= 150 * 100 / 200
= 75
I don't know C++, but I do know Math, so try:
val_slider = 100 * ( val_textin - A ) / ( B - A )
Hey wait. That's exactly what you have. Test case..
A=-200, B=+200, val_textin = 100 (75% of bar, right?)
val_slider = 100 * ( 100 - -200 ) / ( 200 - - 200 )
= ( 300 ) / ( 400 ) * 100
= 75
See, you got it right. The only thing that COULD happen is B==A, but that can't be accounted for with math and requires a single IF. If they are equal, val_slider is exactly B (or A, as they are equal).
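The test cases from both answers can be run through a short sketch. It is written in Python rather than C++ purely for brevity; the function names are made up, and raising an error when A equals B is just one possible way to handle that case, not something the answers prescribe.

```python
def to_slider(value, a, b, slider_max=100.0):
    """Map value in [a, b] linearly onto [0, slider_max]; a and b may be negative."""
    if a == b:
        # Assumption: treat a zero-width range as an error instead of picking a position.
        raise ValueError("A and B must differ")
    return (value - a) * slider_max / (b - a)


def from_slider(position, a, b, slider_max=100.0):
    """Inverse mapping: slider position back to a value in [a, b]."""
    return a + position * (b - a) / slider_max


print(to_slider(-90, -100, 100))   # first answer's first case
print(to_slider(50, -100, 100))    # first answer's second case
print(to_slider(100, -200, 200))   # second answer's test case
```

No abs() calls or sign checks are needed because subtracting A shifts the range to start at zero before scaling.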