DAX Grouping and Ranking in Calculated Columns


My raw data stops at the sales column. I'm looking for some DAX help adding the last two columns as calculated columns.
customer_id order_id order_date sales total_sales_by_customer total_sales_customer_rank
------------- ---------- ------------ ------- ------------------------- ---------------------------
BM 1 9/2/2014 476 550 1
BM 2 10/27/2016 25 550 1
BM 3 9/30/2014 49 550 1
RA 4 12/18/2017 47 525 3
RA 5 9/7/2017 478 525 3
RS 6 7/5/2015 5 5 other
JH 7 5/12/2017 6 6 other
AG 8 9/7/2015 7 7 other
SP 9 5/19/2017 26 546 2
SP 10 8/16/2015 520 546 2

Let's start with total sales by customer:
total_sales_by_customer =
var custID = orders[customer_id]
return CALCULATE(SUM(orders[sales]), FILTER(orders, custID = orders[customer_id]))
First we get the custID, then filter the orders table on this ID and sum the sales per customer.
Next, the ranking:
total_sales_customer_rank =
var rankMe = RANKX(orders, orders[total_sales_by_customer],,,Dense)
return if (rankMe > 3, "other", CONVERT(rankMe, STRING))
We rank each row by its customer's total sales (taken from the first column); if the rank is greater than 3, we replace it with "other".
On your first question: DAX does not work like a typical programming language. Each row is assessed individually. Let's go with your first row: your custID will be "BM".
Next we calculate the sum of all the sales. We filter the whole table on the custID and sum the result. So the filter actually leaves only 3 rows!
This is repeated for each row. That seems slow, but I only describe it this way so you can understand the result you are getting back. In reality there is clever logic to return data fast.
What you want to do, "Orders[Customer ID]=Orders[Customer ID]", is not possible because Orders[Customer ID] is inside the filter and is evaluated for each row the filter iterates, so both sides refer to the same row.
var custid = VALUES(Orders[Customer ID]): VALUES returns a single-column table. You cannot use this in a filter because you would then be comparing a cell value with a table.
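The row-by-row logic described above can also be checked outside DAX. The following Python is only a sketch, assuming the ten sample rows from the question; the function name add_total_and_rank and the top_n parameter are hypothetical and are not part of any Power BI API.

```python
from collections import defaultdict

# Sample rows from the question: (customer_id, order_id, sales).
orders = [
    ("BM", 1, 476), ("BM", 2, 25), ("BM", 3, 49),
    ("RA", 4, 47), ("RA", 5, 478),
    ("RS", 6, 5), ("JH", 7, 6), ("AG", 8, 7),
    ("SP", 9, 26), ("SP", 10, 520),
]


def add_total_and_rank(rows, top_n=3):
    """Append total sales per customer and a dense rank label to each row."""
    # Step 1: total sales per customer (the first calculated column).
    totals = defaultdict(int)
    for customer_id, _order_id, sales in rows:
        totals[customer_id] += sales

    # Step 2: dense rank over the distinct totals, highest total first.
    distinct_totals = sorted(set(totals.values()), reverse=True)
    dense_rank = {total: i + 1 for i, total in enumerate(distinct_totals)}

    # Step 3: label ranks beyond top_n as "other" (the second column).
    result = []
    for customer_id, order_id, sales in rows:
        total = totals[customer_id]
        rank = dense_rank[total]
        label = str(rank) if rank <= top_n else "other"
        result.append((customer_id, order_id, sales, total, label))
    return result


ranked = add_total_and_rank(orders)
```

Every row of a customer receives the same total and the same label, which mirrors how a calculated column is evaluated once per row.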

Related

Latest values by category based on a selected date

First, as I am French, I want to apologise in advance for my poor English!
Despite several days of searching, I cannot find the correct measure to solve my problem.
I think I am close to the solution, but I really need help to finish it!
Here is what I need:
I have a dataset with a date table and a "Position" (i.e. "stock") table, which is my fact table and has a date column.
There is a classic relationship between these 2 tables: many dates in the "Position" table to 1 date in the "Dates" table.
My "Dates" table has one date per day (column "AsOf").
My "Deals" table looks like this:
Id DealId AsOfDate Notional
10000 1 9/1/2022 2000000
10001 1 9/1/2022 3000000
10002 1 9/1/2022 1818147
10010 4 5/31/2022 2000000
10011 4 5/31/2022 997500
10012 4 5/31/2022 1500000
10013 4 5/31/2022 1127820
10014 5 7/27/2022 140000
10015 5 7/27/2022 210000
10016 5 7/27/2022 500000
10017 5 7/27/2022 750000
10018 5 7/27/2022 625000
10019 1 8/31/2022 2000000
10020 1 8/31/2022 3000000
10021 1 8/31/2022 1801257
10022 1 8/31/2022 96976
10023 1 8/31/2022 1193365
10024 1 8/31/2022 67883
Based on a selected date (slicer with all dates from "Dates" table), I would like to calculate the sum of Last Notional for each "Deal" (column "DealId").
So, I must identify, for each Deal, the last "Asof Date" before or equal to the selected date and sum all matching rows.
Examples:
If the selected date is 9/1/2022, I will see all rows except the rows with as-of date 8/31/2022 for deal 1 (as the last date for this deal is 9/1/2022).
So, I expect to see:
DealId Sum of Notional
1 6 818 147
4 5 625 320
5 2 225 000
Grand Total 14 668 467
If I select 8/31/2022, the total for Deal 1 changes (as we now take the rows of 8/31 instead of 9/1):
DealId Sum of Notional
1 8 159 481
4 5 625 320
5 2 225 000
Grand Total 16 009 801
If I select 7/29, only deals 4 and 5 are active on this date, so the results should be:
DealId Sum of Notional
4 5 625 320
5 2 225 000
Grand Total 7 850 320
I think I found a solution for the rows, but my total is wrong (only the notionals of the selected date are totaled).
I also think my measure is incorrect if I try to display the notional amounts aggregated by Rating (other column in my table) instead of deal.
Here is my measure:
Last Notional =
VAR SelectedAsOf =
    SELECTEDVALUE ( Dates[AsOf] )
VAR LastAsofPerDeal =
    CALCULATE (
        MAX ( Deals[AsOf Date] ),
        FILTER ( ALLEXCEPT ( Deals, Deals[DealId] ), Deals[AsOf Date] <= SelectedAsOf )
    )
RETURN
    CALCULATE (
        SUM ( Deals[Notional] ),
        FILTER (
            ALLEXCEPT ( Deals, Deals[DealId] ),
            LastAsofPerDeal = Deals[AsOf Date]
        )
    )
I hope it is clear for you, and you will be able to find a solution for this.
Thanks in advance.
Antoine

Make sure you have no relationship between your calendar table and your deals table.
Create a slicer with your dates table and create a table visual with deal id. Then add a measure to the table as follows:
Sum of Notional =
VAR slicer = SELECTEDVALUE(Dates[Date])
VAR tbl = FILTER(Deals, Deals[AsOfDate] <= slicer)
VAR maxBalanceDate = CALCULATE(MAX(Deals[AsOfDate]), tbl)
RETURN
    CALCULATE(
        SUM(Deals[Notional]),
        Deals[AsOfDate] = maxBalanceDate
    )
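To check the numbers the question expects, the intended calculation (latest as-of date per deal on or before the selected date, then sum the matching rows, with the grand total as the sum over deals) can be sketched in Python. This is not the measure above translated to Python, only an illustration of the target logic on the sample rows; the function name last_notional_by_deal is hypothetical.

```python
from collections import defaultdict
from datetime import date

# Sample rows from the question: (Id, DealId, AsOfDate, Notional).
deals = [
    (10000, 1, date(2022, 9, 1), 2000000),
    (10001, 1, date(2022, 9, 1), 3000000),
    (10002, 1, date(2022, 9, 1), 1818147),
    (10010, 4, date(2022, 5, 31), 2000000),
    (10011, 4, date(2022, 5, 31), 997500),
    (10012, 4, date(2022, 5, 31), 1500000),
    (10013, 4, date(2022, 5, 31), 1127820),
    (10014, 5, date(2022, 7, 27), 140000),
    (10015, 5, date(2022, 7, 27), 210000),
    (10016, 5, date(2022, 7, 27), 500000),
    (10017, 5, date(2022, 7, 27), 750000),
    (10018, 5, date(2022, 7, 27), 625000),
    (10019, 1, date(2022, 8, 31), 2000000),
    (10020, 1, date(2022, 8, 31), 3000000),
    (10021, 1, date(2022, 8, 31), 1801257),
    (10022, 1, date(2022, 8, 31), 96976),
    (10023, 1, date(2022, 8, 31), 1193365),
    (10024, 1, date(2022, 8, 31), 67883),
]


def last_notional_by_deal(rows, selected):
    """Sum Notional per deal, using only each deal's latest AsOfDate <= selected."""
    # Pass 1: latest as-of date per deal that is not after the selected date.
    latest = {}
    for _id, deal_id, as_of, _notional in rows:
        if as_of <= selected and (deal_id not in latest or as_of > latest[deal_id]):
            latest[deal_id] = as_of

    # Pass 2: sum the rows that sit exactly on that latest date.
    totals = defaultdict(int)
    for _id, deal_id, as_of, notional in rows:
        if latest.get(deal_id) == as_of:
            totals[deal_id] += notional
    return dict(totals)


by_deal = last_notional_by_deal(deals, date(2022, 9, 1))
grand_total = sum(by_deal.values())  # grand total = sum of the per-deal results
```

Computing the grand total as the sum of the per-deal results is the design choice that matters here: taking a single maximum date across all deals would keep only the rows of the most recent date.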

Google Sheet Function with IF statement to add 1/0 column

I want to query a number of rows from one sheet into another sheet, and to the right of each row add a column based on one of the queried columns. That is, if column C is "IL", I want the added column to show 0, otherwise 1 (the samples below will make it clearer).
I have tried doing this with Query and Arrayformula, without query, with Filter and importrange. An example of what I tried:
=query(Data!A1:AG,"Select D, E, J, E-J, Q, AG " & IF(AG="Il",0, 1),1)
Raw data sample:
Captured Amount Fee Country
TRUE 336 10.04 NZ
TRUE 37 1.37 GB
TRUE 150 4.65 US
TRUE 45 1.61 US
TRUE 20 0.88 IL
What I would want as a result:
Amount Fee Country Sort
336 10.04 NZ 1
37 1.37 GB 1
150 4.65 US 1
45 1.61 US 1
20 0.88 IL 0

try it like this:
=ARRAYFORMULA(QUERY({Data!A1:Q, {"Sort"; IF(Data!AG2:AG="IL", 0, 1)}},
"select Col4,Col5,Col9,Col5-Col9,Col17,Col18 label Col5-Col9''", 1))
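For readers who want to verify the expected output independently of Sheets, here is a small Python sketch of the same idea: keep the wanted columns and append a 0/1 flag derived from the country. It assumes the five sample rows from the question; add_sort_flag and its excluded_country parameter are hypothetical names.

```python
# Sample rows from the question: (Captured, Amount, Fee, Country).
rows = [
    (True, 336, 10.04, "NZ"),
    (True, 37, 1.37, "GB"),
    (True, 150, 4.65, "US"),
    (True, 45, 1.61, "US"),
    (True, 20, 0.88, "IL"),
]


def add_sort_flag(rows, excluded_country="IL"):
    """Return (Amount, Fee, Country, Sort) with Sort = 0 for the excluded country."""
    return [
        # The comparison ignores case, so "Il" and "IL" are treated alike.
        (amount, fee, country, 0 if country.upper() == excluded_country else 1)
        for _captured, amount, fee, country in rows
    ]


flagged = add_sort_flag(rows)
```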

PowerBI running Total formula

I have a dataset OvertimeHours with EMPLID, checkdate and NumberOfHours (and other fields). I need a running total NumberOfHours for each employee by checkdate. I tried using the Quick Measure option but that only allows for a single column and I have two. I do not want the measure to recalculate when filters are applied. Ultimately what I am trying to do is identify the records for the first 6 hours of overtime worked on each check so that they can get a category of OCB and all overtime over the first 6 hours is OTP and it does not have to be exact (as demonstrated in the output below). I have only been working with Power BI for about a month and this is a pretty complex (for me) formula to figure out...
EMPLID CheckDate WkDate NumberOfHours RunningTotal Category
124 1/1/19 12/20/18 5 5 OCB
124 1/1/19 12/21/18 9 14 OTP
125 1/1/19 12/20/18 3 3 OCB
125 1/1/19 12/20/18 2 5 OCB
125 1/1/19 12/22/18 2 7 OTP
124 1/15/19 1/8/19 3 3 OCB
*Edited to add the WkDate.
Edit:
I have tweaked my query so that I have the running total and a sequential counter now:
Using the first 12 records, I am looking to get the following results:
I can either do it in a query if that is the easiest way or if there is a way to use DAX in PowerBI with this dataset now that I have the sequential piece, I can do that too.

I got it in the query:
select r.CheckDate,
       r.EMPLID,
       case
           when PayrollRunningOTHours <= 6
               then PayrollRunningOTHours
           else 6
       end as OCBHours,
       case
           when PayRollRunningOTHours > 6
               then PayRollRunningOTHours - 6
       end as OTPHours
from #rollingtotal r
inner join lastone l
    on r.CheckDate = l.CheckDate
    and r.EMPLID = l.EMPLID
    and r.OTCounter = l.lastRec
order by r.emplid,
         r.CheckDate,
         r.OTCounter
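The running total and the 6-hour split can be illustrated with a short Python sketch. It assumes the six sample rows from the question, already ordered by WkDate within each employee and check, and it reads the threshold from the sample output (a row is OCB while the running total is at most 6). The names add_running_total, split_hours and OCB_LIMIT are hypothetical.

```python
from collections import defaultdict

# Sample rows from the question: (EMPLID, CheckDate, WkDate, NumberOfHours).
rows = [
    (124, "1/1/19", "12/20/18", 5),
    (124, "1/1/19", "12/21/18", 9),
    (125, "1/1/19", "12/20/18", 3),
    (125, "1/1/19", "12/20/18", 2),
    (125, "1/1/19", "12/22/18", 2),
    (124, "1/15/19", "1/8/19", 3),
]

OCB_LIMIT = 6  # first 6 overtime hours per employee and check


def add_running_total(rows, limit=OCB_LIMIT):
    """Append a running total per (EMPLID, CheckDate) and an OCB/OTP category."""
    running = defaultdict(int)
    result = []
    for emplid, check_date, wk_date, hours in rows:
        key = (emplid, check_date)
        running[key] += hours
        total = running[key]
        category = "OCB" if total <= limit else "OTP"
        result.append((emplid, check_date, wk_date, hours, total, category))
    return result


def split_hours(total, limit=OCB_LIMIT):
    """Mirror the two CASE expressions: OCB hours capped at limit, OTP the excess."""
    ocb = min(total, limit)
    otp = total - limit if total > limit else None  # NULL in the SQL version
    return ocb, otp


with_totals = add_running_total(rows)
```

The category here is assigned per row from the running total, which matches the approximate labelling in the question; split_hours corresponds to the per-check totals produced by the query.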

Average of percent of column totals in DAX

I have a fact table named meetings containing the following:
- staff
- minutes
- type
I then created a summarized table with the following:
TableA =
SUMMARIZECOLUMNS (
    'meetings'[staff],
    'meetings'[type],
    "SumMinutesByStaffAndType", SUM ( 'meetings'[minutes] )
)
This makes a pivot table with staff as rows and columns as types.
For this pivottable I need to calculate each cell as a percent of the column total. For each staff I need the average of their percents. There are only 5 meeting types so I need the sum of these percents divided by 5.
I don't know how to divide one number grouped by two columns by another number grouped by one column. I'm coming from the SQL world so my DAX is terrible and I'm desperate for advice.
I tried creating another summarized table to get the sum of minutes for each type.
TableB =
SUMMARIZECOLUMNS (
    'meetings'[type],
    "SumMinutesByType", SUM ( 'meetings'[minutes] )
)
From there I want 'TableA'[SumMinutesByStaffAndType] / 'TableB'[SumMinutesByType].
TableC =
SUMMARIZECOLUMNS (
'TableA'[staff],
'TableB'[type],
DIVIDE ( 'TableA'[SumMinutesByType], 'TableB'[SumMinutesByType]
)
"A single value for column 'Minutes' in table 'Min by Staff-Contact' cannot be determined. This can happen when a measure formula refers to a column that contains many values without specifying an aggregation such as min, max, count, or sum to get a single result."
I keep arriving at this error which leads me to believe I'm not going about this the "Power BI way".
I have tried making measures and creating matrices on the reports view. I've tried using the group by feature in the Query Editor. I even tried both measures and aggregate tables. I'm likely overcomplicating it and way off the mark so any help is greatly appreciated.
Here's an example of what I'm trying to do.
## Input/First table
staff minutes type
--------- --------- -----------
Bill 5 TELEPHONE
Bill 10 FACE2FACE
Bill 5 INDIRECT
Bill 5 EMAIL
Bill 10 OTHER
Gary 10 TELEPHONE
Gary 5 EMAIL
Gary 5 OTHER
Madison 20 FACE2FACE
Madison 5 INDIRECT
Madison 15 EMAIL
Rob 5 FACE2FACE
Rob 5 INDIRECT
Rob 20 TELEPHONE
Rob 45 FACE2FACE
## Second table with SUM of minutes, Grand Total is column total.
Row Labels    EMAIL   FACE2FACE   INDIRECT   OTHER   TELEPHONE
------------- ------- ----------- ---------- ------- -----------
Bill          5       10          5          10      5
Gary          5                              5       10
Madison       15      20          5
Rob                   50          5                  20
Grand Total   25      80          15         15      35
## Third table where each of the above cells is divided by its column total.
Row Labels EMAIL FACE2FACE INDIRECT OTHER TELEPHONE
------------- ------- ----------- ------------- ------------- -------------
Bill 0.2 0.125 0.333333333 0.666666667 0.142857143
Gary 0.2 0 0 0.333333333 0.285714286
Madison 0.6 0.25 0.333333333 0 0
Rob 0 0.625 0.333333333 0 0.571428571
Grand Total 25 80 15 15 35
## Final table with the sum of the rows in the third table divided by 5.
staff AVERAGE
--------- -------------
Bill 29.35714286
Gary 16.38095238
Madison 23.66666667
Rob 30.5952381
Please let me know if I can clarify an aspect.

You can make use of the built-in options in Power BI, such as % Row total.
If this is not what you are looking for, kindly let me know (I have used your input table).
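As a cross-check of the figures in the question's final table, the calculation (each cell as a share of its column total, then the average of those shares per staff member over the 5 meeting types) can be sketched in Python. This only illustrates the arithmetic on the sample rows and is not a DAX solution; the function name average_percent_of_column_total is hypothetical.

```python
from collections import defaultdict

# Input table from the question: (staff, minutes, type).
meetings = [
    ("Bill", 5, "TELEPHONE"), ("Bill", 10, "FACE2FACE"), ("Bill", 5, "INDIRECT"),
    ("Bill", 5, "EMAIL"), ("Bill", 10, "OTHER"),
    ("Gary", 10, "TELEPHONE"), ("Gary", 5, "EMAIL"), ("Gary", 5, "OTHER"),
    ("Madison", 20, "FACE2FACE"), ("Madison", 5, "INDIRECT"), ("Madison", 15, "EMAIL"),
    ("Rob", 5, "FACE2FACE"), ("Rob", 5, "INDIRECT"), ("Rob", 20, "TELEPHONE"),
    ("Rob", 45, "FACE2FACE"),
]

TYPES = ("EMAIL", "FACE2FACE", "INDIRECT", "OTHER", "TELEPHONE")


def average_percent_of_column_total(rows, types=TYPES):
    """Average, per staff member, of each cell's share of its column total (in %)."""
    cell = defaultdict(float)       # minutes per (staff, type)
    col_total = defaultdict(float)  # minutes per type
    for staff, minutes, meeting_type in rows:
        cell[(staff, meeting_type)] += minutes
        col_total[meeting_type] += minutes

    averages = {}
    for staff in sorted({staff for staff, _minutes, _type in rows}):
        shares = [
            cell.get((staff, t), 0.0) / col_total[t] if col_total[t] else 0.0
            for t in types
        ]
        # Divide by the number of types (5), not by the number of non-empty cells.
        averages[staff] = 100 * sum(shares) / len(types)
    return averages


averages = average_percent_of_column_total(meetings)
```

Dividing by the fixed number of types means a staff member with no minutes for a type contributes a share of 0 for it, as in the third table of the question.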

Changing ID from nth to last row if something happens at nth row

My data has a problem. The survey is conducted on housing units, so two rows with the same person ID might not actually refer to the same person.
I want to assign a different ID to each actually different person.
Let's say I have this data.
id yearmonth age
1 200001 12
1 200002 12
1 200003 14
1 200004 14
1 200005 14
The 3rd row is definitely a different person: the age increases by 2.
So I want to change the ID like this:
id yearmonth age
1 200001 12
1 200002 12
10 200003 14
10 200004 14
10 200005 14
How can I do this? I think I can change the ID of 3rd row by writing
bysort id (yearmonth): replace id=id*10 if age[_n-1]>age+1 | age[_n-1]+1<age
(where I multiply by 10 because all IDs have the same number of digits, so multiplying by 10 won't create any duplicates)
But how can I change all subsequent rows?

Building on what you have, something like this might do what you want.
bysort id (yearmonth): generate idchange = age[_n-1]>age+1 | age[_n-1]+1<age
bysort id (yearmonth): generate numchange = sum(idchange)
replace id = 10*id + (numchange-1) if numchange>0
Note that this will handle the case where one original id has two or more changes detected. For up to 10 changes, anyhow.
id yearmonth age
2 200001 12
2 200002 14
2 200003 15
2 200004 18
2 200005 18
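The idea of carrying a change forward to all subsequent rows through a cumulative count (numchange in the answer) can be sketched in Python. This is only an illustration under two assumptions: the first row of each id is never flagged as a change, and a change means the age differs from the previous row by more than 1. The function name split_ids and the max_gap parameter are hypothetical.

```python
def split_ids(rows, max_gap=1):
    """rows: (id, yearmonth, age). Return rows with ids split at each age jump."""
    result = []
    prev_id = None
    prev_age = None
    num_change = 0
    for person_id, yearmonth, age in sorted(rows, key=lambda r: (r[0], r[1])):
        if person_id != prev_id:
            # Assumption: the first row of an id starts a fresh count, no change.
            num_change = 0
        elif abs(age - prev_age) > max_gap:
            # Age moved by more than max_gap in either direction: a new person.
            num_change += 1
        # The running count applies to this row and every later row of the id.
        new_id = 10 * person_id + (num_change - 1) if num_change > 0 else person_id
        result.append((new_id, yearmonth, age))
        prev_id, prev_age = person_id, age
    return result


question_rows = [(1, 200001, 12), (1, 200002, 12), (1, 200003, 14),
                 (1, 200004, 14), (1, 200005, 14)]
answer_rows = [(2, 200001, 12), (2, 200002, 14), (2, 200003, 15),
               (2, 200004, 18), (2, 200005, 18)]

question_ids = [r[0] for r in split_ids(question_rows)]
answer_ids = [r[0] for r in split_ids(answer_rows)]
```

Because the new id is derived from the running count rather than from the per-row flag, every row after a detected change keeps the new id until the next change.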
