DAX equation to average data with different timespans - powerbi

I have data for different companies. The data stops at day 10 for one of the companies (Company 1), day 6 for the others. If Company 1 is selected with other companies, I want to show the average so that the data runs until day 10, but using day 7, 8, 9, 10 values for Company 1 and day 6 values for others.
I could just fill down days 7-10 for the other companies with the day 6 value, but that would look misleading on the graph. So I need a DAX equation with some magic in it.
As an example, I have companies:
Company 1
Company 2
Company 3
etc. as a filter
And a table like:
Company    Date        Day of Month  Count
Company 1  1.11.2022   1             10
Company 1  2.11.2022   2             20
Company 1  3.11.2022   3             21
Company 1  4.11.2022   4             30
Company 1  5.11.2022   5             40
Company 1  6.11.2022   6             50
Company 1  7.11.2022   7             55
Company 1  8.11.2022   8             60
Company 1  9.11.2022   9             62
Company 1  10.11.2022  10            70
Company 1  11.11.2022  11            NULL
Company 2  1.11.2022   1             15
Company 2  2.11.2022   2             25
Company 2  3.11.2022   3             30
Company 2  4.11.2022   4             34
Company 2  5.11.2022   5             45
Company 2  6.11.2022   6             100
Company 2  7.11.2022   7             NULL
Every date has a row, but the count is NULL after day 10 for Company 1 and after day 6 for the others. If Company 1 or Company 2 is chosen separately, I'd like to show the count as is. If they are chosen together, I'd like the average of the two so that:
Day 5: AVG(40,45)
Day 6: AVG(50,100)
Day 7: AVG(55,100)
Day 8: AVG(60,100)
Day 9: AVG(62,100)
Day 10: AVG(70,100)
Any ideas?

Do you want something like this?
Create a matrix (cross join) from your:
company_table_dim (M rows)
calendar_Days_Table (N rows)
That gives you a new table of M × N rows.
In Power Query, sort the data by company and date, then fill down your QTY column:
(= Table.FillDown(#"Se expandió Fact_Table",{"QTY"}))
Your last known QTY will then be filled until the end of Time_Table, whatever company filter is applied.
Cons: the new M × N matrix could mean millions of rows to calculate.
Greetings
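To make the intended result concrete, here is a minimal Python sketch of the same idea (carry each company's last known value forward, then average per day). It is not DAX or Power Query; the names `counts`, `last_known` and `average_by_day` are made up for illustration, and the data is the sample from the question.

```python
from statistics import mean

# Sample counts from the question; None marks days with no data yet.
counts = {
    "Company 1": {1: 10, 2: 20, 3: 21, 4: 30, 5: 40, 6: 50,
                  7: 55, 8: 60, 9: 62, 10: 70, 11: None},
    "Company 2": {1: 15, 2: 25, 3: 30, 4: 34, 5: 45, 6: 100, 7: None},
}

def last_known(series, day):
    """Return the most recent non-null value at or before `day`, or None."""
    known = [d for d, v in series.items() if d <= day and v is not None]
    return series[max(known)] if known else None

def average_by_day(selected):
    """Average the last known value of each selected company, per day."""
    # Run until the last day on which any selected company has data.
    last_day = max(d for c in selected
                   for d, v in counts[c].items() if v is not None)
    result = {}
    for day in range(1, last_day + 1):
        values = [last_known(counts[c], day) for c in selected]
        result[day] = mean(v for v in values if v is not None)
    return result

averages = average_by_day(["Company 1", "Company 2"])
print(averages[7])  # average of 55 (Company 1) and 100 (Company 2, carried forward)
```

With a single company selected, the same function returns that company's own counts and stops at its last day with data.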

Related

Google Sheets formula for summing/averaging with specific conditions

I am hoping for a formula to take hours from the name columns and sum/average them by week, into a separate table like the 2nd one below. The formulas need to update upon changing the start and end week cells.
Body Part  Start Week  End Week  Arnold (hours)  Usain (hours)  Bob (hours)
Arms       1           3         6               3              0
Legs       1           6         12              36             20
Chest      2           4         6               2              2
Booty      4           6         9               12             3
Core       1           5         10              5              5
Formula Needed:
Hours   Arnold  Usain  Bob
Week 1  6       8      4.33
Week 2  8       8.67   5
Week 3  8       8.67   5
Week 4  9       11.67  6
Week 5  7       11     5.33
Week 6  5       10     4.33
Bonus if there is a way to also quickly average hours by body parts if for example there are multiple Arms rows.
try:
=ARRAYFORMULA(LAMBDA(a, b, QUERY(SPLIT(FLATTEN(BYCOL(D1:F1, LAMBDA(xx, FLATTEN(IF(
IF(a>=SEQUENCE(1, MAX(a)), "Week "&TEXT(SEQUENCE(1, MAX(a))+b, "00"), )="",,
REGEXEXTRACT(OFFSET(xx,,,1), "(.+) \(")&"×"&
IF(a>=SEQUENCE(1, MAX(a)), "Week "&TEXT(SEQUENCE(1, MAX(a))+b, "00"), )&"×"&
QUERY({REGEXEXTRACT(OFFSET(xx,,,1), "(.+) \("); OFFSET(xx,1,,9^9)/(a)}, "offset 1", )))))), "×"),
"select Col2,sum(Col3) where Col3>0 group by Col2 pivot Col1"))
(C2:INDEX(C:C, MAX(ROW(C:C)*(C:C<>"")))-B2:INDEX(B:B, MAX(ROW(B:B)*(B:B<>"")))+1,
B2:INDEX(B:B, MAX(ROW(B:B)*(B:B<>"")))-1))
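For anyone trying to follow what the formula computes, here is a rough Python sketch of the underlying arithmetic, assuming (as the expected output suggests) that each row's hours are spread evenly over its start-to-end weeks and then summed per week. The names `rows` and `hours_per_week` are hypothetical; this is not a translation of the formula itself.

```python
from collections import defaultdict

# (body part, start week, end week, {name: total hours}) from the question.
rows = [
    ("Arms", 1, 3, {"Arnold": 6, "Usain": 3, "Bob": 0}),
    ("Legs", 1, 6, {"Arnold": 12, "Usain": 36, "Bob": 20}),
    ("Chest", 2, 4, {"Arnold": 6, "Usain": 2, "Bob": 2}),
    ("Booty", 4, 6, {"Arnold": 9, "Usain": 12, "Bob": 3}),
    ("Core", 1, 5, {"Arnold": 10, "Usain": 5, "Bob": 5}),
]

def hours_per_week(rows):
    """Spread each row's hours evenly over its weeks and sum per week."""
    weekly = defaultdict(lambda: defaultdict(float))
    for _part, start, end, hours in rows:
        n_weeks = end - start + 1
        for week in range(start, end + 1):
            for name, total in hours.items():
                weekly[week][name] += total / n_weeks
    return weekly

weekly = hours_per_week(rows)
for week in sorted(weekly):
    print(f"Week {week}", {n: round(h, 2) for n, h in weekly[week].items()})
```

Changing a start or end week in `rows` and rerunning gives the updated table, which is the behaviour the question asks of the sheet formula.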

PowerBI running Total formula

I have a dataset OvertimeHours with EMPLID, checkdate and NumberOfHours (and other fields). I need a running total NumberOfHours for each employee by checkdate. I tried using the Quick Measure option but that only allows for a single column and I have two. I do not want the measure to recalculate when filters are applied. Ultimately what I am trying to do is identify the records for the first 6 hours of overtime worked on each check so that they can get a category of OCB and all overtime over the first 6 hours is OTP and it does not have to be exact (as demonstrated in the output below). I have only been working with Power BI for about a month and this is a pretty complex (for me) formula to figure out...
EMPLID CheckDate WkDate NumberOfHours RunningTotal Category
124 1/1/19 12/20/18 5 5 OCB
124 1/1/19 12/21/18 9 14 OTP
125 1/1/19 12/20/18 3 3 OCB
125 1/1/19 12/20/18 2 5 OCB
125 1/1/19 12/22/18 2 7 OTP
124 1/15/19 1/8/19 3 3 OCB
*Edited to add the WkDate.
Edit:
I have tweaked my query so that I have the running total and a sequential counter now:
Using the first 12 records, I am looking to get the following results:
I can either do it in a query if that is the easiest way or if there is a way to use DAX in PowerBI with this dataset now that I have the sequential piece, I can do that too.
I got it in the query:
select r.CheckDate,
       r.EMPLID,
       case
           when PayrollRunningOTHours <= 6
           then PayrollRunningOTHours
           else 6
       end as OCBHours,
       case
           when PayRollRunningOTHours > 6
           then PayRollRunningOTHours - 6
       end as OTPHours
from #rollingtotal r
inner join lastone l
    on r.CheckDate = l.CheckDate
   and r.EMPLID = l.EMPLID
   and r.OTCounter = l.lastRec
order by r.emplid,
         r.CheckDate,
         r.OTCounter
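The split that query performs can be sketched in plain Python as follows, assuming the total overtime per employee and check is what matters. The names `rows`, `OCB_LIMIT` and `split_overtime` are made up for illustration, and the sample rows come from the table in the question. One difference: the query leaves OTPHours NULL when the total is 6 or less, while this sketch returns 0.

```python
from collections import defaultdict

# (EMPLID, CheckDate, NumberOfHours) rows from the question's sample.
rows = [
    (124, "1/1/19", 5), (124, "1/1/19", 9),
    (125, "1/1/19", 3), (125, "1/1/19", 2), (125, "1/1/19", 2),
    (124, "1/15/19", 3),
]
OCB_LIMIT = 6  # the first 6 overtime hours on each check count as OCB

def split_overtime(rows, limit=OCB_LIMIT):
    """Return {(EMPLID, CheckDate): (OCB hours, OTP hours)}."""
    totals = defaultdict(float)
    for emplid, check_date, hours in rows:
        totals[(emplid, check_date)] += hours
    return {key: (min(total, limit), max(total - limit, 0))
            for key, total in totals.items()}

split = split_overtime(rows)
for key in sorted(split):
    print(key, split[key])
```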

Changing ID from nth to last row if something happens at nth row

My data has a problem. The survey is conducted on housing units, so two rows with the same person ID might not actually refer to the same person.
I want to assign different IDs to people who are actually different.
Let's say I have this data.
id yearmonth age
1 200001 12
1 200002 12
1 200003 14
1 200004 14
1 200005 14
The 3rd row is definitely a different person: the age increases by 2.
So I want to change ID like
id yearmonth age
1 200001 12
1 200002 12
10 200003 14
10 200004 14
10 200005 14
How can I do this? I think I can change the ID of 3rd row by writing
bysort id (yearmonth): replace id=id*10 if age[_n-1]>age+1 | age[_n-1]+1<age
(I multiply by 10 because all IDs have the same number of digits, so multiplying by 10 won't create any duplicates.)
But how can I change all subsequent rows?
Building on what you have, something like this might do what you want.
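* flag rows where age differs by more than 1 from the previous row (age[0] is missing, hence _n > 1)
bysort id (yearmonth): generate idchange = _n > 1 & (age[_n-1]>age+1 | age[_n-1]+1<age)
* running count of detected changes within each original id
bysort id (yearmonth): generate numchange = sum(idchange)
* every row from the k-th change onward gets id 10*id + (k-1)
replace id = 10*id + (numchange-1) if numchange>0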
Note that this will handle the case where one original id has two or more changes detected. For up to 10 changes, anyhow.
id yearmonth age
2 200001 12
2 200002 14
2 200003 15
2 200004 18
2 200005 18
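The running-count idea is perhaps easier to see outside Stata. Below is a small Python sketch of the same logic, under the assumption that a change means the age differs by more than 1 from the previous row of the same id. The function name `reassign_ids` is hypothetical; the rows are the two sample datasets from this thread.

```python
from itertools import groupby

rows = [  # (id, yearmonth, age)
    (1, 200001, 12), (1, 200002, 12), (1, 200003, 14),
    (1, 200004, 14), (1, 200005, 14),
    (2, 200001, 12), (2, 200002, 14), (2, 200003, 15),
    (2, 200004, 18), (2, 200005, 18),
]

def reassign_ids(rows):
    """Give a new id to every row from a detected person change onward."""
    out = []
    ordered = sorted(rows, key=lambda r: (r[0], r[1]))
    for pid, group in groupby(ordered, key=lambda r: r[0]):
        changes = 0      # running count of changes, like sum(idchange)
        prev_age = None  # no previous row at the start of each id
        for _, yearmonth, age in group:
            if prev_age is not None and abs(age - prev_age) > 1:
                changes += 1
            new_id = pid if changes == 0 else 10 * pid + (changes - 1)
            out.append((new_id, yearmonth, age))
            prev_age = age
    return out

for row in reassign_ids(rows):
    print(row)
```

For id 2 this detects two changes (12 to 14, then 15 to 18), so the rows end up with ids 2, 20, 20, 21, 21.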

Adding column based on ID in another data

data1 is data from 1990 and it looks like
Panelkey Region income
1 9 30
2 1 20
4 2 40
data2 is data from 2000 and it looks like
Panelkey Region income
3 2 40
2 1 30
1 1 20
I want to add a column of where each person lived in 1990.
Panelkey Region income Region1990
3 2 40 .
2 1 30 1
1 1 20 9
How can I do this in Stata?
The following code handles panels that appear in multiple regions in the same year by keeping the region with the larger income. This would make sense if income were proportional to the fraction of the year spent in a region. Ties in income are broken by keeping the highest Region value. Other types of aggregation might make sense (take a look at the -collapse- command).
Note that I tweaked your data by adding a second row for the last observation in each year:
clear
input Panelkey Region income
1 9 30
2 1 20
4 2 40
4 10 80
end
rename (Region income) =1990
bysort Panelkey (income Region): keep if _n==_N
isid Panelkey
save "data1990.dta", replace
clear
input Panelkey Region income
3 2 40
2 1 30
1 1 20
1 9 20
end
bysort Panelkey (income Region): keep if _n==_N
isid Panelkey
merge 1:1 Panelkey using "data1990.dta", keep(match master) nogen
list, clean noobs
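As a cross-check of what the merge should produce, here is a hedged Python sketch of the same two steps: keep one row per Panelkey (largest income, then largest Region), then look up the 1990 region for each 2000 row. The helper `keep_top` and the variable names are made up for illustration, the data is the tweaked sample above, and only the region is carried over, not the 1990 income.

```python
def keep_top(rows):
    """Per Panelkey, keep the row with the largest (income, Region)."""
    best = {}
    for key, region, income in rows:
        if key not in best or (income, region) > (best[key][1], best[key][0]):
            best[key] = (region, income)
    return best

# (Panelkey, Region, income), including the extra duplicate rows.
data1990 = [(1, 9, 30), (2, 1, 20), (4, 2, 40), (4, 10, 80)]
data2000 = [(3, 2, 40), (2, 1, 30), (1, 1, 20), (1, 9, 20)]

top1990 = keep_top(data1990)
top2000 = keep_top(data2000)

# Left join: every 2000 row is kept; Region1990 is None when there is no match.
merged = [
    (key, region, income, top1990.get(key, (None, None))[0])
    for key, (region, income) in sorted(top2000.items())
]
for row in merged:
    print(row)  # (Panelkey, Region, income, Region1990)
```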

Django query aggregation

Imagine a number guessing game where one person thinks of a number and another person has to guess it. The game is over if the correct number was guessed.
The models might look like this
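from django.db import models

class SecretNumber(models.Model):
    number = models.IntegerField()

class Guess(models.Model):
    # on_delete is required from Django 2.0 onward; CASCADE is an assumption here
    secretnumber = models.ForeignKey(SecretNumber, on_delete=models.CASCADE)
    guess = models.IntegerField()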
After having played four times, the database might look like this:
id number
==========
1 10
2 54
3 68
4 25
id secretnumber_id guess
=============================
1 1 50
2 1 30
3 1 10
4 2 99
5 2 60
6 2 54
7 3 1
8 3 68
9 4 73
10 4 34
11 4 86
12 4 51
13 4 25
As you can see, the guesser was very lucky: it took him 3, 3, 2 and 4 guesses. But that's just to keep this example short.
Now I need to come up with a query that lets me display the following data:
Nb. guesses Count
=====================
2 1
3 2
4 1
A manual SQL statement would look something like this:
SELECT inner_count AS 'Nb. guesses', count(inner_count) AS 'Count' FROM (
SELECT secretnumber_id, count(id) AS inner_count FROM guess GROUP BY secretnumber_id
) GROUP BY inner_count
I thought about annotating an annotation, but this seems not to be possible.
Any ideas?
If you're using Django (i.e. models rather than plain classes), you want to use the QuerySet aggregation functions,
e.g.
from django.db.models import Count
guesses = Guess.objects.values('secretnumber').annotate(Count('secretnumber'))
This will give you a queryset of dictionaries, each with a secretnumber key and a secretnumber__count key holding the number of guesses for that secret number.
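That query gives the inner count only. The second grouping (how many games took N guesses) can be done in Python on the result. Here is a minimal sketch without the ORM, assuming the secretnumber_id values have already been fetched into a list. The variable names are hypothetical. Note that with the guess table as posted, the fourth game has five rows, so the last bucket comes out as 5 rather than 4.

```python
from collections import Counter

# secretnumber_id of each row in the guess table, as posted.
guess_secret_ids = [1, 1, 1, 2, 2, 2, 3, 3, 4, 4, 4, 4, 4]

# Inner GROUP BY: number of guesses per secret number.
guesses_per_game = Counter(guess_secret_ids)

# Outer GROUP BY: how many games needed each number of guesses.
histogram = Counter(guesses_per_game.values())

for n_guesses, n_games in sorted(histogram.items()):
    print(n_guesses, n_games)
```

In a real project the list could come from something like `Guess.objects.values_list('secretnumber_id', flat=True)`, so the counting happens in Python rather than in the database.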