cumulative average powerbi by month - powerbi

I have below dataset.
Math Literature Biology date student
4 2 5 2019-08-25 A
4 5 4 2019-08-08 A
5 4 5 2019-08-23 A
5 5 5 2019-08-15 A
5 5 5 2019-07-19 A
5 5 5 2019-07-15 A
5 5 5 2019-07-03 A
5 5 5 2019-06-26 A
1 1 2 2019-06-18 A
2 3 3 2019-06-14 A
5 5 5 2019-05-01 A
2 1 3 2019-04-26 A
I need to develop a solution in powerbi so in output I have cumulative average per subject per month
For example
April May June July August
Math | 2 3.5 3 3.75 4
Literature | 1 3 3 3.75 3.83
Biology | 3 4 3.6 4.125 4.33
Can you help?

You can use a matrix visualization for this.
Create a month-year variable and use it in the columns.
Use Average of Math,Literature and Biology in values
Under the format pane --> Values --> Show on rows --> Select this
This should give the view you are looking for. You can edit the value headers to your requirement.

Related

Google Sheets formula for summing/averaging with specific conditions

I am hoping for a formula to take hours from the name columns and sum/average them by week, into a separate table like the 2nd one below. The formulas need to update upon changing the start and end week cells.
Body Part
Start Week
End Week
Arnold (hours)
Usain (hours)
Bob (hours)
Arms
1
3
6
3
0
Legs
1
6
12
36
20
Chest
2
4
6
2
2
Booty
4
6
9
12
3
Core
1
5
10
5
5
Formula Needed:
Hours
Arnold
Usian
Bob
Week 1
6
8
4.33
Week 2
8
8.67
5
Week 3
8
8.67
5
Week 4
9
11.67
6
Week 5
7
11
5.33
Week 6
5
10
4.33
Bonus if there is a way to also quickly average hours by body parts if for example there are multiple Arms rows.
try:
=ARRAYFORMULA(LAMBDA(a, b, QUERY(SPLIT(FLATTEN(BYCOL(D1:F1, LAMBDA(xx, FLATTEN(IF(
IF(a>=SEQUENCE(1, MAX(a)), "Week "&TEXT(SEQUENCE(1, MAX(a))+b, "00"), )="",,
REGEXEXTRACT(OFFSET(xx,,,1), "(.+) \(")&"×"&
IF(a>=SEQUENCE(1, MAX(a)), "Week "&TEXT(SEQUENCE(1, MAX(a))+b, "00"), )&"×"&
QUERY({REGEXEXTRACT(OFFSET(xx,,,1), "(.+) \("); OFFSET(xx,1,,9^9)/(a)}, "offset 1", )))))), "×"),
"select Col2,sum(Col3) where Col3>0 group by Col2 pivot Col1"))
(C2:INDEX(C:C, MAX(ROW(C:C)*(C:C<>"")))-B2:INDEX(B:B, MAX(ROW(B:B)*(B:B<>"")))+1,
B2:INDEX(B:B, MAX(ROW(B:B)*(B:B<>"")))-1))

Subtract Value at Aggregate by Quarter

Values are for two groups by quarter.
In DAX, need to summarize all the data but also need to remove -3 from each quarter in 2021 for Group 1, without allowing the value to go below 0.
This only impacts:
Group 1 Only
2021 Only
However, I also need to retain the data details without the adjustment. So I can't do this in Power Query. My data detail is actually in months but I'm only listing one date per quarter for brevity.
Data:
Group
Date
Value
1
01/01/2020
10
1
04/01/2020
8
1
07/01/2020
18
1
10/01/2020
2
1
01/01/2021
12
1
04/01/2021
3
1
07/01/2021
7
1
10/01/2021
2
2
01/01/2020
10
2
04/01/2020
8
2
07/01/2020
18
2
10/01/2020
2
2
01/01/2021
12
2
04/01/2021
3
2
07/01/2021
7
2
10/01/2021
2
Result:
Group
Qtr/Year
Value
1
Q1-2020
10
1
Q2-2020
8
1
Q3-2020
18
1
Q4-2020
2
1
2020
38
1
Q1-2021
9
1
Q2-2021
0
1
Q3-2021
4
1
Q4-2021
0
1
2021
13
2
Q1-2020
10
2
Q2-2020
8
2
Q3-2020
18
2
Q4-2020
2
2
2020
2
2
Q1-2021
12
2
Q2-2021
3
2
Q3-2021
7
2
Q4-2021
2
2
2021
24
You issue can be solved by using Matrix Table, and also to add new column to process value before create the table:
First, add a new column using following formula:
Revised value =
var newValue = IF(YEAR(Sheet1[Date])=2021,Sheet1[Value]-3,Sheet1[Value])
return
IF(newValue <0,0,newValue)
Second, create the matrix table for the desired outcome:

Sum 5 rows at a time in an ordered SAS table with no unique identifier using proc sql

I'm working with a SAS table where I have ordered data that I need to sum in intervals of 5. I don't have a unique ID I can use for the group by statement and I'm struggling to find a solution.
Say I have this table
Number Name X Y
1 Susan 2 1
2 Susan 3 3
3 Susan 3 3
4 Susan 4 1
5 Susan 1 2
6 Susan 1 1
7 Susan 1 1
8 Susan 2 4
9 Susan 1 5
10 Susan 4 2
1 Steve 2 4
2 Steve 2 3
3 Steve 1 2
4 Steve 3 5
5 Steve 1 1
6 Steve 1 3
7 Steve 2 3
8 Steve 2 4
9 Steve 1 1
10 Steve 1 1
I'd want the output to look like
Number Name X Y
1-5 Susan 13 10
6-10 Susan 9 13
1-5 Steve 9 15
6-10 Steve 7 12
Is there an easy way to get output like this using proc sql? Thanks!
Try this:
proc sql;
select ceil(Number/5) as Grouping, Name, sum(X), sum(Y)
from have
group by Name, Grouping;
quit;

Broadcasting pandas dataframe to two dimensional matrix

I have a dataframe of size 12x24 which appears as (I have displayed truncated version):
HE 1 2 3 4 5
0 1 1.8 2.5 3.5 8.5
1 2 2.6 2.9 4.3 8.7
2 3 4.4 2.3 5.3 4.3
3 4 2.6 2.1 4.2 5.3
The column names are number 1 through 12 (representing months) and rows are numbered 1 through 24 (representing hours).
I have another DateTable dataframe which has data as follows:
Date Month Hour
2001-01-01 1 1
2001-02-01 2 4
2001-01-05 1 3
2011-01-31 3 2
2012-01-01 1 5
I want to broadcast the values from 12x24 array into the DateTable to get the following:
Date Month Hour Values
2001-01-01 1 1 3.5
2001-02-01 2 4 5.3
2001-01-05 1 3 2.5
2011-01-31 3 2 2.6
2012-01-01 1 5 1.8
I envision creating some kind of multiindex from the 12x24 table and using against the DateTable but not quite sure about the syntax as I am struggling with syntax.

Stata: how to duplicate observations under certain conditions

Please help me duplicate a variable under certain conditions? My original dataset looks like this:
week category averageprice
1 1 5
1 2 6
2 1 4
2 2 7
This table says that for each week, there is a unique average price for each category of goods.
I need to create the following variables:
averageprice1 (av. price for category 1)
averageprice2 (av. price for category 2)
such that:
week category averageprice1 averageprice2
1 1 5 6
1 2 5 6
2 1 4 7
2 2 4 7
meaning that for week 1, average price for category 1 stayed at $5, and av. price for cater 2 stayed at 6. Similar logic applies to week 2.
As you could see that the new variables are duplicated depending on a week.
I am still learning Stata. I tried:
bysort week: replace averageprice1=averageprice if categ==1
but it doesn't work as expected.
You are not duplicating observations (meaning here in the Stata sense, i.e. cases or records) here at all, as (1) the number of observations remains the same (2) you are copying certain values, not the contents of observations. Similar comment on "duplicating variables". However, that's just loose use of terminology.
Taking your example very literally
clear
input week category averageprice
1 1 5
1 2 6
2 1 4
2 2 7
end
bysort week (category) : gen averageprice1 = averageprice[1]
by week: gen averageprice2 = averageprice[2]
l
+--------------------------------------------------+
| week category averag~e averag~1 averag~2 |
|--------------------------------------------------|
1. | 1 1 5 5 6 |
2. | 1 2 6 5 6 |
3. | 2 1 4 4 7 |
4. | 2 2 7 4 7 |
+--------------------------------------------------+
This is a standard application of subscripting with by:. Your code didn't work because it did not oblige Stata to look in other observations when that is needed. In fact your use of bysort week did not affect how the code applied at all.
EDIT:
A generalization is
egen averageprice1 = mean(averageprice / (category == 1)), by(week)
egen averageprice2 = mean(averageprice / (category == 2)), by(week)