Power BI Measure to countrows of related values several related tables deep


I need to create a measure for a card that will count the total number of Question Groups that exist for each person using the tables below.
I've tried the following, but it returns 10 instead of the expected 6 (George = 2, Susan = 1, Tom = 1, Bill = 1, Sally = 1, Mark = 0, Jason = 0).
Measure = COUNTROWS(NATURALLEFTOUTERJOIN(NATURALLEFTOUTERJOIN(People,Questions),'Question Groups'))
What am I doing wrong?
Table: People
PeopleID  Name
1         George
2         Susan
3         Tom
4         Bill
5         Sally
6         Mark
7         Jason
Table: relPeopleQuestions
PeopleID  QuestionID
1         1
1         2
1         3
2         4
2         5
3         6
4         7
5         8
Table: Questions
Question ID  Question name               Question Group ID
1            How are you?                1
2            Favorite Color?             2
3            Favorite Movie?             2
4            Sister's Name               3
5            Brother's Name              3
6            What is your birthdate?     1
7            What City do you live in?   1
8            Favorite game?              2
Table: Question Groups
Question Group ID  Question Group Name
1                  Assorted
2                  Favorites
3                  Relatives
A working example file can be obtained here.

A distinct count on the Question Group ID from the Questions table would seem to be sufficient, e.g.
MyMeasure =
VAR MyTable =
    SUMMARIZE (
        People,
        People[Name],
        "Count", DISTINCTCOUNT ( Questions[Question Group ID] )
    )
RETURN
    SUMX ( MyTable, [Count] )
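
To check the expected numbers outside Power BI, here is a minimal Python sketch of the same idea: count the distinct question groups per person, then sum those counts. It is not DAX and does not reproduce the model's relationships; the variable and function names are illustrative only. It also shows that a left join of People through relPeopleQuestions has 10 rows, which may be where the 10 in the question comes from (an assumption, since the original measure joins different tables).

```python
# Sample data copied from the question's tables.
people = {1: "George", 2: "Susan", 3: "Tom", 4: "Bill", 5: "Sally", 6: "Mark", 7: "Jason"}
rel_people_questions = [(1, 1), (1, 2), (1, 3), (2, 4), (2, 5), (3, 6), (4, 7), (5, 8)]
# QuestionID -> Question Group ID
question_group = {1: 1, 2: 2, 3: 2, 4: 3, 5: 3, 6: 1, 7: 1, 8: 2}


def groups_per_person():
    """Return {name: number of distinct question groups} for every person."""
    groups = {pid: set() for pid in people}
    for pid, qid in rel_people_questions:
        groups[pid].add(question_group[qid])
    return {people[pid]: len(found) for pid, found in groups.items()}


def left_join_row_count():
    """Row count of People left-joined to the bridge table.

    Each person contributes one row per matching bridge row,
    or a single row when there is no match at all.
    """
    matches = {pid: 0 for pid in people}
    for pid, _ in rel_people_questions:
        matches[pid] += 1
    return sum(max(n, 1) for n in matches.values())


counts = groups_per_person()
print(counts)
print(sum(counts.values()))   # sum of the per-person distinct counts
print(left_join_row_count())  # rows produced by the left join
```

Counting joined rows counts questions (plus one row per person without any), whereas the card needs distinct groups per person, which is what the per-person sets compute.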

Related

DAX equation to average data with different timespans

I have data for different companies. The data stops at day 10 for one of the companies (Company 1), day 6 for the others. If Company 1 is selected with other companies, I want to show the average so that the data runs until day 10, but using day 7, 8, 9, 10 values for Company 1 and day 6 values for others.
I could just fill down days 7-10 for the other companies with the day 6 value, but that would look misleading on the graph. So I need a DAX equation with some magic in it.
As an example, I have companies:
Company 1
Company 2
Company 3
etc. as a filter
And a table like:
Company    Date        Day of Month  Count
Company 1  1.11.2022   1             10
Company 1  2.11.2022   2             20
Company 1  3.11.2022   3             21
Company 1  4.11.2022   4             30
Company 1  5.11.2022   5             40
Company 1  6.11.2022   6             50
Company 1  7.11.2022   7             55
Company 1  8.11.2022   8             60
Company 1  9.11.2022   9             62
Company 1  10.11.2022  10            70
Company 1  11.11.2022  11            NULL
Company 2  1.11.2022   1             15
Company 2  2.11.2022   2             25
Company 2  3.11.2022   3             30
Company 2  4.11.2022   4             34
Company 2  5.11.2022   5             45
Company 2  6.11.2022   6             100
Company 2  7.11.2022   7             NULL
Every date has a row, but for days over 6/10 the count is NULL. If Company 1 or Company 2 is chosen separately, I'd like to show the count as is. If they are chosen together, I'd like the average of the two so that:
Day 5: AVG(40,45)
Day 6: AVG(50,100)
Day 7: AVG(55,100)
Day 8: AVG(60,100)
Day 9: AVG(62,100)
Day 10: AVG(70,100)
Any ideas?

Do you want something like this?
Create a matrix (a cross join) from your:
company_table_dim (M rows)
calendar_Days_Table (N rows)
This gives you a new table of M x N rows.
In Power Query, sort the data and fill down your QTY column:
= Table.FillDown(#"Se expandió Fact_Table",{"QTY"})
Your last known QTY will then be filled down to the end of the time table for any company filter.
Cons: the new M x N matrix could have millions of rows to calculate.
Greetings
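
The fill-down idea can be checked against the question's numbers with a small Python sketch. This is only an illustration of "carry the last known value forward, then average"; it is not the Power Query or DAX implementation, and the function names are made up for the example.

```python
from statistics import mean

# Non-NULL counts from the question's table: {company: {day_of_month: count}}.
counts = {
    "Company 1": {1: 10, 2: 20, 3: 21, 4: 30, 5: 40, 6: 50, 7: 55, 8: 60, 9: 62, 10: 70},
    "Company 2": {1: 15, 2: 25, 3: 30, 4: 34, 5: 45, 6: 100},
}


def last_known(series, day):
    """Return the count for the latest day <= `day` that has a value, or None."""
    known = [d for d in series if d <= day]
    return series[max(known)] if known else None


def average_by_day(selected):
    """Average the last known counts of the selected companies.

    The result runs up to the latest day on which any selected company has data.
    """
    last_day = max(max(counts[company]) for company in selected)
    result = {}
    for day in range(1, last_day + 1):
        values = [last_known(counts[company], day) for company in selected]
        result[day] = mean(v for v in values if v is not None)
    return result


avg = average_by_day(["Company 1", "Company 2"])
print(avg[5], avg[7], avg[10])
```

With a single company selected, the function returns that company's counts unchanged and stops at its last day with data, which matches the "show the count as is" requirement.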

Lookup value based on group max value in PowerBI

I would like to first state that I am a beginner with DAX and this is one of my attempts (which seemed to be the closest to the solution I need). I come from a SQL heavy background so my "thinking" is somehow fixed in that way.
I have tried to solve this by implementing something that would match the following SQL logic:
CASE WHEN MAX(column) OVER (PARTITION BY group) = column2 THEN column3 ELSE "" END
However, this doesn't seem to work directly like in SQL, so I would like to ask for some help.
I have the current set of data, which is imported from a simple text file.
ID GroupID Amount
1 2 8502
2 2 8502
3 2 8502
4 2 8502
1 6 80
2 6 80
And I would like to find a way to get the following result:
ID GroupID Amount LatestGroupAmount
1 2 8502
2 2 8502
3 2 8502
4 2 8502 8502
1 6 80
2 6 80 80
And then have a Total under LatestGroupAmount, totaling to 8582.
So far, I have created 2 new measures in my table, MaxID and MaxIDbyGroup.
MaxID = MAX(data[ID])
and
MaxIDbyGroup = CALCULATE([MaxID], ALLEXCEPT(data, data[GroupID]))
This gives me:
ID GroupID Amount MaxID MaxIDbyGroup
1 2 8502 1 4
2 2 8502 2 4
3 2 8502 3 4
4 2 8502 4 4
1 6 80 1 2
2 6 80 2 2
Now, I would like to create a new measure that just does a lookup of the Amount, based on the equality between ID and MaxIDbyGroup.
I have tried to create a new measure with the following definition:
LatestGroupAmount = LOOKUPVALUE(data[Amount], data[GroupId], data[MaxIDbyGroup])
But this gives me the following output:
ID GroupID Amount LatestGroupAmount
1 2 8502
2 2 8502
3 2 8502
4 2 8502
1 6 80 8502
2 6 80 8502
Edit:
I have created another measure:
MaxGrid = MAX(data[GroupID])
And I have tried using CALCULATE with the following definition for LatestGroupAmount:
LatestGroupAmount =
CALCULATE(
    SUM( data[Amount] ),
    FILTER( data, data[ID] = data[MAXID_by_author] ),
    FILTER( data, data[GroupID] = data[MaxGrid] )
)
And it seems to show what I want, however, it filters the 6 rows I have to only 2 rows (although I think it does an aggregation).
ID GroupID Amount LatestGroupAmount
4 2 8502 8502
2 6 80 80
The reason I think it's an aggregation is that when I add MaxID to the widget, the output shows the correct number of rows. Essentially, that output is what I want, except for the MaxID column.
If I remove the MaxID column, the widget automatically summarizes to two rows, but I want to show all of the 6 rows.

You can use this measure to achieve your result:
LatestGroupAmount =
VAR TT01 =
    ADDCOLUMNS(
        SUMMARIZE( Data, Data[ID], Data[GroupID] ),
        "MaxID", CALCULATE( MAX( Data[ID] ), ALLEXCEPT( Data, Data[GroupID] ) )
    )
RETURN
    CALCULATE(
        MAX( Data[Amount] ),
        FILTER( TT01, Data[ID] = [MaxID] )
    )
Then build your table visual by putting [ID], [GroupID] and [Amount] on rows and the measure above into values.
Make sure that "Show items with no data" is ticked for the [ID] and [GroupID] columns.

Your current definition for LatestGroupAmount is searching in the GroupId column, though I believe that should be the ID column, i.e.:
LOOKUPVALUE( data[Amount], data[ID], data[MaxIDbyGroup] )
In any case, this will fail since that column contains duplicate entries. As such, you should use something like:
LatestGroupAmount :=
CALCULATE(
    MAX( data[Amount] ),
    FILTER( data, data[Id] = data[MaxIDbyGroup] )
)
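
As a sanity check on the target output, the "amount on the row with the group's highest ID" logic can be sketched in Python. This mirrors the SQL window-function idea from the question rather than any DAX evaluation rule, and the function name is illustrative.

```python
# (ID, GroupID, Amount) rows from the question.
rows = [
    (1, 2, 8502), (2, 2, 8502), (3, 2, 8502), (4, 2, 8502),
    (1, 6, 80), (2, 6, 80),
]


def latest_group_amounts(rows):
    """Return one value per row: Amount if the row holds its group's max ID, else None."""
    max_id = {}
    for row_id, group_id, _ in rows:
        max_id[group_id] = max(row_id, max_id.get(group_id, row_id))
    return [
        amount if row_id == max_id[group_id] else None
        for row_id, group_id, amount in rows
    ]


latest = latest_group_amounts(rows)
total = sum(amount for amount in latest if amount is not None)
print(latest)
print(total)
```

All six rows are kept, with a value only on the last row of each group, and the total comes out at the 8582 the question asks for.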

PowerBI DAX - Sum table by criteria and date

I'm relatively new to PowerBI/PowerQuery/DAX and have become stuck on the following problem. I am unsure which road to go down to get the best outcome and would appreciate any help.
My data table is connected to a time tracking application. A User enters a time entry every time they complete a task. The task can be either a Project task or an Admin task. Each of these has multiple sub-categories beneath it, each with its own ID. This translates to my table as follows:
User ProjectID AdminID Hours Date
John 1 2 01/01/22
John 11 1 01/01/22
John 4 1 01/01/22
John 12 3 01/01/22
John 13 1 01/01/22
Pete 7 1 01/01/22
Pete 2 4 01/01/22
Pete 3 2 01/01/22
Mike 1 6 01/01/22
Mike 9 1 01/01/22
Mike 10 1 01/01/22
My objective is, for each Date in the table, to calculate the total hours spent either doing Project tasks or Admin tasks. I am not concerned about the specific breakdown (ie the sum of the unique IDs), rather the overall total. The above example covers just one day, in reality my data covers multiple years. My expected output will look like this :
User TotalProject TotalAdmin Date
John 3 5 01/01/22
John 3 4 01/02/22
John 5 2 01/03/22
Pete 5 1 01/01/22
Pete 1 8 01/02/22
Pete 6 2 01/03/22
Mike 6 2 01/01/22
Mike 6 1 01/02/22
Mike 7 2 01/03/22
I am unsure the best method to achieve this - either by creating some kind of column in the table through PowerQuery? Or a calculated column using DAX? And if so, what the SUM syntax would look like?
I'm very willing to learn, so any tips would be greatly appreciated!

For your sample input, just create 2 measures.
Total Admin = CALCULATE( SUM('Table'[Hours]), NOT(ISBLANK('Table'[AdminID])))
Total Project = CALCULATE( SUM('Table'[Hours]), NOT(ISBLANK('Table'[ProjectID])))
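
The two measures amount to "sum Hours where the corresponding ID is not blank", evaluated per User and Date by the visual. A rough Python equivalent is sketched below. The rows are hypothetical, because the sample table in the question lost its blank cells and it is no longer clear which ID column each value belongs to.

```python
from collections import defaultdict

# Hypothetical rows: (User, ProjectID, AdminID, Hours, Date).
# Assumption: exactly one of ProjectID / AdminID is filled in per row.
entries = [
    ("John", 1, None, 2, "01/01/22"),
    ("John", None, 11, 1, "01/01/22"),
    ("John", 4, None, 1, "01/01/22"),
    ("Pete", None, 7, 1, "01/01/22"),
    ("Pete", 2, None, 4, "01/01/22"),
]


def totals_by_user_and_date(entries):
    """Sum Hours per (User, Date), split by which ID column is filled in."""
    totals = defaultdict(lambda: {"TotalProject": 0, "TotalAdmin": 0})
    for user, project_id, admin_id, hours, date in entries:
        key = (user, date)
        if project_id is not None:
            totals[key]["TotalProject"] += hours
        if admin_id is not None:
            totals[key]["TotalAdmin"] += hours
    return dict(totals)


totals = totals_by_user_and_date(entries)
for (user, date), value in totals.items():
    print(user, date, value["TotalProject"], value["TotalAdmin"])
```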

Power BI DAX Measure works for one column but not another

Using the information below I need to create a new table in DAX called Table (Download a demo file here).
I need to find the location of each employee (column "Name") at the time of the sale date in column "Sale Date" based on their contract details in table DbEmployees. If there is more than one valid contract for a given employee that the sale date fits in, use the shortest contract length.
My problem is that the below measure isn't working to generate column "Location", but it works just fine for column "new value".
Why is this happening and how can it be fixed?
Expected result:
SaleID  EmployeeID  Sale Date   new value  Name         Location
1       45643213    2021-02-04  89067445   Sally Shore  4
2       57647868    2020-04-15  57647868   Paul Bunyon  3
3       89067445    2019-09-24  57647868   Paul Bunyon  6
DbEmployees:
ID         Name           StartDate   EndDate     Location  Position
546465546  Sandra Newman  2021/01/01  2021/12/31  1         Manager
546465546  Sandra Newman  2020/01/01  2020/12/31  2         Clerk
546465546  Sandra Newman  2019/01/01  2019/12/31  3         Clerk
545365743  Paul Bunyon    2021/01/01  2021/12/31  6         Manager
545365743  Paul Bunyon    2020/04/01  2020/05/01  3         Clerk
545365743  Paul Bunyon    2019/04/01  2021/01/01  6         Manager
796423504  Sally Shore    2020/01/01  2020/12/31  4         Clerk
783546053  Jack Tomson    2019/01/01  2019/12/31  2         Manager
DynamicsSales:
SaleID  EmployeeID  Sale Date
1       45643213    2021/02/04
2       57647868    2020/04/15
3       89067445    2019/09/24
DynamicsContacts:
EmployeeID  Name           Email
45643213    Sandra Newman  sandra.newman#hotmail.com
65437658    Jack Tomson    jack.tomson#hotmail.com
57647868    Paul Bunyon    paul.bunyon#hotmail.com
89067445    Sally Shore    sally.shore#hotmail.com
DynamicsAudit:
SaleID  Changed Date  old value  new value  AuditID  Valid Until
1       2019/06/08    65437658   57647868   1        2020-06-07
1       2020/06/07    57647868   89067445   2        2021-05-07
1       2021/05/07    89067445   45643213   3        2021-05-07
2       2019/06/08    65437658   57647868   4        2020-06-07
2       2020/06/07    57647868   89067445   5        2021-05-07
2       2021/05/07    89067445   45643213   6        2021-05-07
3       2019/06/08    65437658   57647868   7        2020-06-07
3       2020/06/07    57647868   89067445   8        2021-05-07
3       2021/05/07    89067445   45643213   9        2021-05-07

From what I can see there are a couple of issues with your formula.
First of all, there is no relationship between Table and DbEmployees, so you are filtering exclusively on the dates, which might get you the wrong Location. This can be fixed by changing the formula to:
Location =
VAR CurrentContractDate = [Sale Date]
VAR empName = [Name]
RETURN
    VAR RespLocation =
        TOPN (
            1,
            FILTER ( DbEmployees, DbEmployees[Name] = empName ),
            IF (
            .....
Secondly, you need to remember that the TOPN function can return multiple rows, from the documentation:
If there is a tie, in order_by values, at the N-th row of the table, then all tied rows are returned. Then, when there are ties at the N-th row the function might return more than n rows.
This can be fixed by picking the Max/Min of the result in the table:
RETURN MAXX(SELECTCOLUMNS( RespLocation,"Location", [Location] ), [Location])
Finally, I don't understand why the last row on the expected result should be a 3, given that the sale date is within a record with location 6.
Full expression:
Location =
VAR CurrentContractDate = [Sale Date]
VAR empName = [Name]
RETURN
    VAR RespLocation =
        TOPN (
            1,
            FILTER ( DbEmployees, DbEmployees[Name] = empName ),
            IF (
                CurrentContractDate <= DbEmployees[EndDate]
                    && CurrentContractDate >= DbEmployees[StartDate], // Check whether there is a matching date
                DATEDIFF ( DbEmployees[StartDate], DbEmployees[EndDate], DAY ), // If so, rank matching locations (you may want to employ a different formula)
                MIN ( // If the location is not matching, calculate how close it is (from both start and end date)
                    ABS ( DATEDIFF ( CurrentContractDate, DbEmployees[StartDate], DAY ) ),
                    ABS ( DATEDIFF ( CurrentContractDate, DbEmployees[EndDate], DAY ) )
                ) + 1000000 // Add a discriminating factor in case there are matching rows that should be favoured over non-matching ones
            ),
            1
        )
    RETURN
        MAXX ( SELECTCOLUMNS ( RespLocation, "Location", [Location] ), [Location] )
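
The ranking used in the expression can be traced by hand with a short Python sketch. It follows the same rule (contract length for matching contracts, distance in days plus a large penalty otherwise, ties resolved by the highest Location) on a subset of the DbEmployees rows. The function and constant names are illustrative and not part of the DAX model.

```python
from datetime import date

# (Name, StartDate, EndDate, Location) rows taken from DbEmployees in the question.
contracts = [
    ("Paul Bunyon", date(2021, 1, 1), date(2021, 12, 31), 6),
    ("Paul Bunyon", date(2020, 4, 1), date(2020, 5, 1), 3),
    ("Paul Bunyon", date(2019, 4, 1), date(2021, 1, 1), 6),
    ("Sally Shore", date(2020, 1, 1), date(2020, 12, 31), 4),
]

# Same discriminating factor as in the DAX expression.
PENALTY = 1_000_000


def rank(sale_date, start, end):
    """Lower is better: contract length if the sale date fits, else distance plus penalty."""
    if start <= sale_date <= end:
        return (end - start).days
    nearest = min(abs((sale_date - start).days), abs((sale_date - end).days))
    return nearest + PENALTY


def location_at(name, sale_date):
    """Location of the best-ranked contract for `name`; ties resolve to the highest Location."""
    candidates = [
        (rank(sale_date, start, end), location)
        for contract_name, start, end, location in contracts
        if contract_name == name
    ]
    if not candidates:
        return None
    best = min(r for r, _ in candidates)
    return max(location for r, location in candidates if r == best)


print(location_at("Paul Bunyon", date(2020, 4, 15)))
print(location_at("Paul Bunyon", date(2019, 9, 24)))
print(location_at("Sally Shore", date(2021, 2, 4)))
```

Taking the maximum over the tied rows plays the role of the MAXX wrapper: it guarantees a single value even when several contracts share the best rank.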

Stata: how to duplicate observations under certain conditions

Please help me duplicate a variable under certain conditions? My original dataset looks like this:
week category averageprice
1 1 5
1 2 6
2 1 4
2 2 7
This table says that for each week, there is a unique average price for each category of goods.
I need to create the following variables:
averageprice1 (av. price for category 1)
averageprice2 (av. price for category 2)
such that:
week category averageprice1 averageprice2
1 1 5 6
1 2 5 6
2 1 4 7
2 2 4 7
meaning that for week 1, the average price for category 1 is $5 and the average price for category 2 is $6. Similar logic applies to week 2.
As you can see, the new variables are repeated within each week.
I am still learning Stata. I tried:
bysort week: replace averageprice1=averageprice if categ==1
but it doesn't work as expected.

You are not duplicating observations (meaning here in the Stata sense, i.e. cases or records) here at all, as (1) the number of observations remains the same (2) you are copying certain values, not the contents of observations. Similar comment on "duplicating variables". However, that's just loose use of terminology.
Taking your example very literally
clear
input week category averageprice
1 1 5
1 2 6
2 1 4
2 2 7
end
bysort week (category) : gen averageprice1 = averageprice[1]
by week: gen averageprice2 = averageprice[2]
l
     +--------------------------------------------------+
     | week   category   averag~e   averag~1   averag~2 |
     |--------------------------------------------------|
  1. |    1          1          5          5          6 |
  2. |    1          2          6          5          6 |
  3. |    2          1          4          4          7 |
  4. |    2          2          7          4          7 |
     +--------------------------------------------------+
This is a standard application of subscripting with by:. Your code didn't work because it did not oblige Stata to look in other observations when that is needed. In fact your use of bysort week did not affect how the code applied at all.
EDIT:
A generalization is
egen averageprice1 = mean(averageprice / (category == 1)), by(week)
egen averageprice2 = mean(averageprice / (category == 2)), by(week)
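
For readers who do not use Stata, the same "look up another observation in the same week" idea can be sketched in Python. This only illustrates the logic and is not a translation of by: or egen; the function name is made up for the example.

```python
# (week, category, averageprice) rows from the question.
rows = [
    (1, 1, 5), (1, 2, 6),
    (2, 1, 4), (2, 2, 7),
]


def spread_prices(rows):
    """Attach each week's category 1 and category 2 prices to every row of that week."""
    price = {(week, category): p for week, category, p in rows}
    return [
        {
            "week": week,
            "category": category,
            "averageprice1": price.get((week, 1)),
            "averageprice2": price.get((week, 2)),
        }
        for week, category, _ in rows
    ]


result = spread_prices(rows)
for row in result:
    print(row)
```

Like the egen version, the lookup is keyed on category rather than on position within the week, so it does not depend on sort order or on every week having both categories.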
