PowerBI: Calculate difference between columns

I'm struggling with measures and I would appreciate any help.
I have a table with the columns CategoryName, CaseID, CaseOrder and Value.
For example:
CategoryName CaseID CaseOrder Value
A C1 2 10
B C1 2 20
C C1 2 30
A C2 3 15
C C2 3 25
A C3 1 10
B C3 1 15
C C3 1 10
I want to sum Value by CaseID and CategoryName, with the cases ordered by CaseOrder (I've already sorted the CaseID column by CaseOrder under Data/Modelling), and then calculate the difference between each case's total and the total for the case with CaseOrder = 1.
(Note: in this particular example there is only one record per CaseID and CategoryName, so the sum is redundant, but in the real data I need to sum by CategoryName.)
The result I would like to see (I'm using the Matrix visual to display the data):
CategoryName   C3            C1            C2
               Value  Diff   Value  Diff   Value  Diff
A              10     0      10     0      15     5
B              15     0      20     5             -15
C              10     0      20     10     25     15
I've tried the approach from "PowerBI: how do you calculate the difference between two columns in matrix", but it does not work. In this particular example, the Diff column for cases C1 and C2 returns exactly the same value as the Value column, not the difference.
Note that there may be 3, 4, 5, ... different CaseID values, depending on the data I'm importing.

Here's the result I was able to get:
Using this measure:
Diff =
VAR CaseOrderValue =
    CALCULATE(
        MAX(Table1[Value]),
        ALLEXCEPT(Table1, Table1[CategoryName]),
        Table1[CaseOrder] = "1"
    )
RETURN
    MAX(Table1[Value]) - CaseOrderValue
You may get unexpected results if you try to aggregate the values; this measure is designed to work at the finest level of detail.
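To make the intended calculation concrete outside Power BI, here is a minimal Python sketch, not a DAX replacement. It sums Value per CategoryName and CaseID, then subtracts the total of the case whose CaseOrder is 1. The function name value_and_diff and the choice to treat a missing category/case combination as 0 are assumptions; the latter matches the -15 shown for B under C2 in the question. With the sample rows as posted, category C in case C1 sums to 30, so its difference comes out as 20.

```python
from collections import defaultdict

# Sample rows from the question: (CategoryName, CaseID, CaseOrder, Value)
ROWS = [
    ("A", "C1", 2, 10), ("B", "C1", 2, 20), ("C", "C1", 2, 30),
    ("A", "C2", 3, 15), ("C", "C2", 3, 25),
    ("A", "C3", 1, 10), ("B", "C3", 1, 15), ("C", "C3", 1, 10),
]


def value_and_diff(rows, baseline_order=1):
    """Return {category: {case_id: (total, total - baseline_total)}}."""
    totals = defaultdict(int)   # (category, case_id) -> summed Value
    case_order = {}             # case_id -> CaseOrder
    categories = set()
    for category, case_id, order, value in rows:
        totals[(category, case_id)] += value
        case_order[case_id] = order
        categories.add(category)

    # The baseline is the case whose CaseOrder equals baseline_order.
    baseline = next(c for c, o in case_order.items() if o == baseline_order)
    cases = sorted(case_order, key=case_order.get)  # columns sorted by CaseOrder

    result = {}
    for category in sorted(categories):
        base_total = totals.get((category, baseline), 0)
        row = {}
        for case in cases:
            total = totals.get((category, case), 0)  # assumption: missing means 0
            row[case] = (total, total - base_total)
        result[category] = row
    return result


matrix = value_and_diff(ROWS)
for category, cells in matrix.items():
    print(category, cells)
```

Because the sum happens before the subtraction, this version keeps working when a CaseID and CategoryName pair has several records.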

Related

Get the max of the average for each group

I have the following table:
EmpId DeptId WeekNumber Month NumberofCalls
1 3 4 1 34
2 3 2 3 59
I created a measure to calculate the average number of calls:
AvgCalls = AVERAGE('MyTable'[NumberofCalls])
Now I want to get the maximum average number of calls by month and week.
I will have 3 filters:
Month
Week
Once I select all of them, the histogram bar should show the employee with the maximum average calls.
Once I select the Month and the Week, I want the histogram to display the code of the employee (W1, W2, W3...) with the maximum average. In my case, I get all the employees instead of only the employee with the maximum average.
Here is my solution.
I tested it with a random dataset. Here is my data:
EmpId DeptId WeekNumber Month NumberofCalls
Emp01 3 W4 1 34
Emp01 3 W2 3 59
Emp02 3 W5 4 68
Emp02 3 W6 4 76
Emp03 3 W10 5 90
Emp04 4 W10 6 98
Emp04 4 W11 6 45
Emp05 4 W12 7 56
Emp06 4 W13 7 23
Emp07 4 W15 9 45
Emp08 4 W34 8 56
Emp09 4 W52 8 44
Emp05 4 W36 9 23
Emp01 4 W17 10 51
Emp02 4 W23 9 67
Emp06 4 W29 11 28
Emp05 4 W34 12 34
Emp07 4 W41 11 21
Emp04 4 W37 12 33
I wrote this measure using an iterator function (ADDCOLUMNS):
MaxAverageEmployer =
VAR TAvgCalls =
    ADDCOLUMNS(
        SUMMARIZE(MyTable, MyTable[EmpId], MyTable[Month ], MyTable[WeekNumber ]),
        "AvgCall", CALCULATE(AVERAGE('MyTable'[NumberofCalls]))
    )
VAR TMaxAvgCalls =
    ADDCOLUMNS(
        TAvgCalls,
        "MaxAvg", CALCULATE(MAXX(TAvgCalls, [AvgCall]))
    )
VAR MaxEmpID =
    ADDCOLUMNS(
        TMaxAvgCalls,
        "MaxEmp", CALCULATE(VALUES(MyTable[EmpId]), FILTER(TMaxAvgCalls, [AvgCall] = [MaxAvg]))
    )
RETURN
    MAXX(MaxEmpID, [MaxEmp])
Here is the part:
It showed nothing when I tried to display it on a histogram (or bar chart visual), but it gave me correct values on a table visual:
WeekNumber: I put it on Rows.
MonthNumber: I put it on a slicer to filter it.
Here is the final solution, and I hope it is what you are looking for!
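The selection logic can also be checked outside Power BI. The following Python sketch is an illustration under assumptions, not a translation of the measure: it filters the sample rows by an optional month and week, averages NumberofCalls per employee over the remaining rows, and returns the employee with the highest average. The function name top_employee is hypothetical, DeptId is left out because it plays no role here, and ties are resolved by taking the first employee encountered.

```python
from collections import defaultdict

# (EmpId, WeekNumber, Month, NumberofCalls) from the sample data above
CALLS = [
    ("Emp01", "W4", 1, 34), ("Emp01", "W2", 3, 59), ("Emp02", "W5", 4, 68),
    ("Emp02", "W6", 4, 76), ("Emp03", "W10", 5, 90), ("Emp04", "W10", 6, 98),
    ("Emp04", "W11", 6, 45), ("Emp05", "W12", 7, 56), ("Emp06", "W13", 7, 23),
    ("Emp07", "W15", 9, 45), ("Emp08", "W34", 8, 56), ("Emp09", "W52", 8, 44),
    ("Emp05", "W36", 9, 23), ("Emp01", "W17", 10, 51), ("Emp02", "W23", 9, 67),
    ("Emp06", "W29", 11, 28), ("Emp05", "W34", 12, 34), ("Emp07", "W41", 11, 21),
    ("Emp04", "W37", 12, 33),
]


def top_employee(calls, month=None, week=None):
    """Return (EmpId, average) for the best average in the slice, or None."""
    sums = defaultdict(lambda: [0, 0])  # EmpId -> [total calls, row count]
    for emp, wk, mo, n in calls:
        if month is not None and mo != month:
            continue
        if week is not None and wk != week:
            continue
        sums[emp][0] += n
        sums[emp][1] += 1
    if not sums:
        return None  # nothing matches the selected filters
    averages = {emp: total / count for emp, (total, count) in sums.items()}
    best = max(averages, key=averages.get)
    return best, averages[best]


print(top_employee(CALLS, month=9))              # best employee in month 9
print(top_employee(CALLS, month=6, week="W11"))  # month and week selected
```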

PowerBI Matrix Average instead of Subtotal and Conditional Formatting According to That Average

Hello, I am new to Power BI and still find it hard to work with.
I have a matrix like this:
DATE Sales Refund
26 Agu 45 5
p1 10 3
p2 15 2
p3 20 0
27 Agu 60 1
p1 15 1
p2 20 0
p3 25 0
The date rows show subtotals, as they normally do. Instead, I want to show that day's average there, and then apply conditional formatting based on it: in Sales, a cell below the average should be marked with a red point, and in Refund, a cell above the average should be marked.
Is there a way to do that? I searched for a while but could not find one.
The output I want looks like this (a star stands for the red point):
DATE Sales Refund
26 Agu 15 1.66
p1 10* 3*
p2 15 2*
p3 20 0
27 Agu 20 0.33
p1 15* 1*
p2 20 0
p3 25 0
Thanks.
You can colour the background. For example, create this measure:
AVG =
IF(
    SELECTEDVALUE(RefundTab[Sale]) < CALCULATE(AVERAGE(RefundTab[Sale]), ALL(RefundTab[Code])),
    0,
    1
)
Then, from the menu, open Conditional formatting -> Background color and use this measure there.
Alternatively, you can create a measure that returns a string instead of a number and appends a Unicode symbol to the value:
SumSaleIf =
VAR _sale = SUM(RefundTab[Sale])
VAR _IfAVG = CALCULATE(AVERAGE(RefundTab[Sale]), ALL(RefundTab[Code]))
VAR _check = IF(_sale < _IfAVG, _sale & UNICHAR(128315), _sale & "")
RETURN
    _check
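The rule behind both measures can be sketched in Python to confirm which cells should be marked. This is only an illustration of the comparison, using the numbers from the question; the function name flag_cells and the tuple layout are assumptions. A Sales cell is flagged when it is below the day's average, a Refund cell when it is above.

```python
from statistics import mean

# date -> {code: (Sales, Refund)}, copied from the question's matrix
MATRIX = {
    "26 Agu": {"p1": (10, 3), "p2": (15, 2), "p3": (20, 0)},
    "27 Agu": {"p1": (15, 1), "p2": (20, 0), "p3": (25, 0)},
}


def flag_cells(matrix):
    """Return {date: (avg_sales, avg_refund, {code: (flag_sales, flag_refund)})}."""
    report = {}
    for date, items in matrix.items():
        avg_sales = mean(sales for sales, _ in items.values())
        avg_refund = mean(refund for _, refund in items.values())
        flags = {
            code: (sales < avg_sales, refund > avg_refund)
            for code, (sales, refund) in items.items()
        }
        report[date] = (avg_sales, avg_refund, flags)
    return report


report = flag_cells(MATRIX)
for date, (avg_sales, avg_refund, flags) in report.items():
    print(date, avg_sales, round(avg_refund, 2), flags)
```

The flagged cells agree with the stars in the desired output: p1 in both columns on both days, plus the Refund cell of p2 on 26 Agu.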

openoffice calc sumproduct with a twist

This is my first attempt at VBA beyond using simple functions, so I'm asking for a kick start here.
Assume this (part of a) sheet:
factor b-count c-count d-count
A2 b2 c2 d2 ...
A3 b3 c3 d3 ...
Assume that these are the first columns and rows A1 to D3, holding numeric values each.
If factor is 1, I want A(N) (column 'A', row N >= 2) to hold the sumproduct of row 1 and row N.
The twist comes when factor is not 1. In that case I want a sumproduct of
count*round(value * factor).
Example:
1.5 2 1 0 4
=myfunc(2) 4 8 11 15
=myfunc(3) 11 20 28 36
=myfunc(4) 29 53 74 94
where myfunc(2) should result in
round(4*1.5)*2 + round(8*1.5)*1 + round(15*1.5)*4 = 6*2 + 12*1 + 23*4 = 12 + 12 + 92 = 116
(the third column drops out because its count is 0),
myfunc(3) = 17*2 + 30 + 54*4 = 34 + 30 + 216 = 280,
myfunc(4) = 44*2 + 80 + 141*4 = 88 + 80 + 564 = 732, etc.
I could just insert a row below each one, multiplying every value with the factor; but I would love something fancier.
Basically, I thought of something like this (pun not intended):
col='B'
sum=0
do while (col)(N)>0
    sum=sum+(col)(1)*round((col)(N)*A1;0)
    col=col+1
loop
A(N)=sum
where (col)(N) refers to the cell in column col and row N.
Not important enough to study the manual; but it would be great if someone can do this off the cuff.
Another point: I have read that custom functions must be stored in the "Standard Library";
but I could not find any mention on HOW to do that. Who will point me to the right manual page?
Go to Tools -> Macros -> Organize Macros -> OpenOffice Basic. Select My Macros -> Standard -> Module 1 (that is what is meant by the Standard library), and press Edit.
Paste the following code.
Function SumProductOfTwoRows(firstColumn As Long, row As Long, firstRow As Long)
    'For example: =SUMPRODUCTOFTWOROWS(COLUMN(); ROW(); ROW($A$1))
    firstColumn = firstColumn - 1 'column A is index 0
    row = row - 1 'row 1 is index 0
    firstRow = firstRow - 1 'row 1 is index 0
    oSheet = ThisComponent.CurrentController.ActiveSheet
    sum = 0
    column = firstColumn + 1
    factor = oSheet.getCellByPosition(firstColumn, firstRow).getValue()
    Do
        value = oSheet.getCellByPosition(column, row).getValue()
        count = oSheet.getCellByPosition(column, firstRow).getValue()
        If value = 0 Then Exit Do
        sum = sum + count * CLng(value * factor)
        column = column + 1
    Loop
    SumProductOfTwoRows = sum
End Function
Enter this formula in A2 and drag to fill down to A4.
=SUMPRODUCTOFTWOROWS(COLUMN(); ROW(); ROW($A$1))
This kind of user-defined function produces an error when re-opening the file. To avoid the error, see my answer at https://stackoverflow.com/a/39254907/5100564.
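The expected values from the question can be cross-checked outside the spreadsheet. Below is a minimal Python sketch of the count*round(value * factor) sum; the names round_half_up and scaled_sumproduct are hypothetical. One assumption matters: the question's arithmetic rounds halves up (22.5 becomes 23), whereas Python's built-in round() rounds halves to the nearest even number (round(22.5) gives 22), so the sketch uses floor(x + 0.5), which is only meant for non-negative values. Unlike the Basic loop, it does not stop at the first zero value; it simply processes the lists it is given.

```python
import math


def round_half_up(x):
    """Round a non-negative number to the nearest integer, halves going up."""
    return math.floor(x + 0.5)


def scaled_sumproduct(factor, counts, values):
    """Sum of count * round(value * factor) over paired columns."""
    return sum(c * round_half_up(v * factor) for c, v in zip(counts, values))


FACTOR = 1.5            # cell A1 in the example
COUNTS = [2, 1, 0, 4]   # row 1, columns B..E
ROWS = {
    2: [4, 8, 11, 15],
    3: [11, 20, 28, 36],
    4: [29, 53, 74, 94],
}

results = {n: scaled_sumproduct(FACTOR, COUNTS, vals) for n, vals in ROWS.items()}
print(results)
```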

SAS - Selecting optimal quantities

I'm trying to solve a problem in SAS where I have quantities of customers across a range of groups, and the quantities I select need to be as even across the different categories as possible. This will be easier to explain with a small table, which is a simplification of a much larger problem I'm trying to solve.
Here is the table:
Customer Category | Revenue band | Churn Band | # Customers
A 1 1 4895
A 1 2 383
A 1 3 222
A 2 1 28
A 2 2 2828
A 2 3 232
B 1 1 4454
B 1 2 545
B 1 3 454
B 2 1 4534
B 2 2 434
B 2 3 454
Suppose I need to select 3000 customers from category A, and 3000 customers from category B. From the second category, within each A and B, I need to select an equal amount from 1 and 2. If possible, I need to select a proportional amount across each 1, 2, and 3 subcategories. Is there an elegant solution to this problem? I'm relatively new to SAS and so far I've investigated OPTMODEL, but the examples are either too simple or too advanced to be much use to me yet.
Edit: I've thought about using survey select. I can use this to select equal sizes across the Revenue Bands 1, 2, and 3. However where I'm lacking customers in the individual churn bands, surveyselect may not select the maximum number of customers available where those numbers are low, and I'm back to manually selecting customers.
There are still some ambiguities in the problem statement, but I hope that the PROC OPTMODEL code below is a good start for you. I tried to add examples of many different features, so that you can toy around with the model and hopefully get closer to what you actually need.
Of the many things you could optimize, I am minimizing the maximum violation from your "If possible" goal, e.g.:
min MaxMismatch = MaxChurnMismatch;
I was able to model your constraints as a Linear Program, which means that it should scale very well. You probably have other constraints you did not mention, but those would probably be beyond the scope of this site.
With the data you posted, the output of the print statements shows that the optimal penalty corresponds to choosing 1500 customers from A,1,1, where the ideal would be about 1736. That mismatch is more expensive than ignoring the customers from several smaller groups, so those groups end up with 0:
[1]  ChooseByCat
A    3000
B    3000

[1]  [2]  [3]  Choose  IdealProportion
A    1    1    1500    1736.670
A    1    2       0     135.882
A    1    3       0      78.762
A    2    1      28       9.934
A    2    2    1240    1003.330
A    2    3     232      82.310
B    1    1    1500    1580.210
B    1    2       0     193.358
B    1    3       0     161.072
B    2    1    1500    1608.593
B    2    2       0     153.976
B    2    3       0     161.072

Proportion  MaxChurnMisMatch
0.35478     236.67
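As a sanity check on the printed output, the following Python sketch re-evaluates the constraints and the objective for the Choose values shown. It does not solve the model; it only verifies the reported numbers. The function name check_solution is hypothetical, and because Proportion is taken from the rounded printout (0.35478), the recomputed mismatch agrees with 236.67 only approximately.

```python
# (cat, rev, churn) -> (customers available, Choose value from the printout)
SOLUTION = {
    ("A", 1, 1): (4895, 1500), ("A", 1, 2): (383, 0), ("A", 1, 3): (222, 0),
    ("A", 2, 1): (28, 28), ("A", 2, 2): (2828, 1240), ("A", 2, 3): (232, 232),
    ("B", 1, 1): (4454, 1500), ("B", 1, 2): (545, 0), ("B", 1, 3): (454, 0),
    ("B", 2, 1): (4534, 1500), ("B", 2, 2): (434, 0), ("B", 2, 3): (454, 0),
}
PROPORTION = 0.35478  # rounded value from the printout


def check_solution(solution, proportion, goal=3000):
    """Return (feasible, worst_mismatch) for a candidate selection."""
    by_cat = {}      # cat -> total chosen
    by_cat_rev = {}  # (cat, rev) -> total chosen
    within_bounds = True
    worst = 0.0
    for (cat, rev, _churn), (available, chosen) in solution.items():
        within_bounds = within_bounds and 0 <= chosen <= available
        by_cat[cat] = by_cat.get(cat, 0) + chosen
        by_cat_rev[(cat, rev)] = by_cat_rev.get((cat, rev), 0) + chosen
        # Same quantity the model bounds by MaxChurnMismatch.
        worst = max(worst, abs(chosen - proportion * available))
    goal_met = all(total == goal for total in by_cat.values())
    rev_balanced = all(by_cat_rev[(c, 1)] == by_cat_rev[(c, 2)] for c in by_cat)
    return within_bounds and goal_met and rev_balanced, worst


feasible, worst = check_solution(SOLUTION, PROPORTION)
print(feasible, round(worst, 2))
```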
That is probably not the ideal solution, but figuring out exactly how to model your requirements would not be as useful for this site. You can contact me offline if that is relevant.
I've added quotes from your problem statement as comments in the code below.
Have fun!
data custCounts;
    input cat $ rev churn n;
    datalines;
A 1 1 4895
A 1 2 383
A 1 3 222
A 2 1 28
A 2 2 2828
A 2 3 232
B 1 1 4454
B 1 2 545
B 1 3 454
B 2 1 4534
B 2 2 434
B 2 3 454
;
proc optmodel printlevel = 0;
    set CATxREVxCHURN init {} inter {<'A',1,1>};
    set CAT = setof{<c,r,ch> in CATxREVxCHURN} c;
    num n{CATxREVxCHURN};
    read data custCounts into CATxREVxCHURN=[cat rev churn] n;
    put n[*]=;
    var Choose{<c,r,ch> in CATxREVxCHURN} >= 0 <= n[c,r,ch]
      , MaxChurnMisMatch >= 0, Proportion >= 0 <= 1
    ;
    /* From OP:
       Suppose I need to select 3000 customers from category A,
       and 3000 customers from category B. */
    num goal = 3000;
    /* See "implicit slice" for the parenthesis notation, i.e. (c) below. */
    impvar ChooseByCat{c in CAT} =
        sum{<(c),r,ch> in CATxREVxCHURN} Choose[c,r,ch];
    con MatchCatGoal{c in CAT}:
        ChooseByCat[c] = goal;
    /* From OP:
       From the second category, within each A and B,
       I need to select an equal amount from 1 and 2 */
    con MatchRevenueGroupsWithinCat{c in CAT}:
        sum{<(c),(1),ch> in CATxREVxCHURN} Choose[c,1,ch]
      = sum{<(c),(2),ch> in CATxREVxCHURN} Choose[c,2,ch]
    ;
    /* From OP:
       If possible, I need to select a proportional amount
       across each 1, 2, and 3 subcategories. */
    con MatchBandProportion{<c,r,ch> in CATxREVxCHURN, sign in / 1 -1 /}:
        MaxChurnMismatch >= sign * ( Choose[c,r,ch] - Proportion * n[c,r,ch] );
    min MaxMismatch = MaxChurnMismatch;
    solve;
    print ChooseByCat;
    impvar IdealProportion{<c,r,ch> in CATxREVxCHURN} = Proportion * n[c,r,ch];
    print Choose IdealProportion;
    print Proportion MaxChurnMismatch;
quit;

How to group data in kdb+ using customized groups?

I have a table (allsales) with a column for time (sale_time). I want to group the data by sale_time, but in buckets. For example, any data where the time is between 00:00:00 and 03:00:00 should be grouped together, 03:00:00 to 06:00:00 should be grouped together, and so on. Is there a way to write such a query?
xbar is useful for rounding values down to interval boundaries, e.g.
q)5 xbar 1 3 5 8 10 11 12 14 18
0 0 5 5 10 10 10 10 15
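For readers who do not use q, the rounding that xbar performs on integers can be imitated in Python. This is a rough sketch of the behaviour shown above, not of the full q primitive (which also handles temporal and float types); the Python function name xbar is just a stand-in.

```python
def xbar(width, values):
    """Round each value down to the nearest multiple of width."""
    return [width * (v // width) for v in values]


buckets = xbar(5, [1, 3, 5, 8, 10, 11, 12, 14, 18])
print(buckets)
```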
We can then use this to group rows into time groups, for your example:
q)s:([] t:13:00t+00:15t*til 24; v:til 24)
q)s
t v
--------------
13:00:00.000 0
13:15:00.000 1
13:30:00.000 2
13:45:00.000 3
14:00:00.000 4
14:15:00.000 5
..
q)select count i,sum v by xbar[`int$03:00t;t] from s
t | x v
------------| ------
12:00:00.000| 8 28
15:00:00.000| 12 162
18:00:00.000| 4 86
"by xbar[`int$03:00t;t]" rounds the time column t down to the start of its three-hour interval, and this rounded value is then used as the group-by key.
There are a few more ways to achieve the same result.
q)select count i , sum v by t:01:00u*3 xbar t.hh from s
q)select count i , sum v by t:180 xbar t.minute from s
t | x v
-----| ------
12:00| 8 28
15:00| 12 162
18:00| 4 86
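The same grouping can be reproduced in Python to see where the counts and sums come from. This is a minimal sketch under assumptions: times are stored as milliseconds since midnight, the sample table mirrors s (24 rows starting at 13:00 in 15-minute steps), and the function name bucket_summary is hypothetical. The label formatting assumes buckets start on whole hours.

```python
MS_PER_MINUTE = 60_000
MS_PER_HOUR = 60 * MS_PER_MINUTE

# Mirror of table s: t = 13:00 + 15 minutes * i (ms since midnight), v = i
TABLE = [((13 * 60 + 15 * i) * MS_PER_MINUTE, i) for i in range(24)]


def bucket_summary(table, width_ms):
    """Group rows by time rounded down to width_ms; return count and sum of v."""
    groups = {}
    for t, v in table:
        start = width_ms * (t // width_ms)  # same rounding as xbar
        count, total = groups.get(start, (0, 0))
        groups[start] = (count + 1, total + v)
    # Label each bucket by its starting hour, in time order.
    return {f"{s // MS_PER_HOUR:02d}:00": cv for s, cv in sorted(groups.items())}


summary = bucket_summary(TABLE, 3 * MS_PER_HOUR)
print(summary)
```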
In all cases, be careful if the table also has a date column: include it in the by clause, otherwise the same time window on different dates will be merged into one group and produce wrong results.
q)s:([] d:24#2013.05.07 2013.05.08; t:13:00t+00:15t*til 24; v:til 24)
q)select count i , sum v by d, t:180 xbar t.minute from s
d t | x v
----------------| ----
2013.05.07 12:00| 4 12
2013.05.07 15:00| 6 78
2013.05.07 18:00| 2 42
2013.05.08 12:00| 4 16
2013.05.08 15:00| 6 84
2013.05.08 18:00| 2 44
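The effect of including the date in the key can be illustrated with a short Python sketch. It is an assumption-laden stand-in for the q query: dates are plain strings that alternate row by row, as 24#2013.05.07 2013.05.08 does, times are minutes since midnight, and the function name summarize is hypothetical.

```python
DATES = ["2013.05.07", "2013.05.08"]

# Mirror of the dated table: d alternates, t = 13:00 + 15 min * i, v = i
TABLE = [(DATES[i % 2], 13 * 60 + 15 * i, i) for i in range(24)]


def summarize(table, width_min, by_date=True):
    """Count rows and sum v per time bucket, optionally split by date."""
    groups = {}
    for d, minute, v in table:
        bucket = width_min * (minute // width_min)
        key = (d, bucket) if by_date else bucket
        count, total = groups.get(key, (0, 0))
        groups[key] = (count + 1, total + v)
    return groups


per_date = summarize(TABLE, 180)
merged = summarize(TABLE, 180, by_date=False)
print(per_date[("2013.05.07", 720)], per_date[("2013.05.08", 720)], merged[720])
```

Here 720 is the 12:00 bucket expressed in minutes; without the date in the key, the two days collapse into a single group.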