How to sum by group in Power Query Editor? - powerbi

My table looks like this:
Serial WO# Value Indicator
A 333 10 333-1
A 333 4 333-2
B 456 5 456-1
A 334 1 334-1
A 334 5 334-2
I want to create a new column that sums up the Values based on WO#. It should look like this:
Serial WO# Value Indicator SumValue
A 333 10 333-1 14
A 333 4 333-2 14
B 456 5 456-1 5
A 334 1 334-1 6
A 334 5 334-2 6
Eventually I will remove duplicates on the WO# and remove the Value and Indicator Columns from the data. I can't seem to find a function in M that allows for sum by group. Thanks in advance!

If you load the data with Power Query, the Group By command on the ribbon will do just that.
Make sure to use the Advanced option and add every column you want to retain to the grouping section.
[Screenshots: the Group By dialog in Excel and in Power BI]
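For readers who want to see the grouping logic spelled out, here is a minimal sketch in plain Python rather than M. It is not the Power Query implementation; the helper names `add_sum_by_group` and `group_totals` are hypothetical, and the rows are copied from the question. The first helper reproduces the SumValue column, the second mimics grouping on Serial and WO# so that one row per work order remains.

```python
from collections import defaultdict

# Sample rows copied from the question: (Serial, WO#, Value, Indicator)
rows = [
    ("A", 333, 10, "333-1"),
    ("A", 333, 4, "333-2"),
    ("B", 456, 5, "456-1"),
    ("A", 334, 1, "334-1"),
    ("A", 334, 5, "334-2"),
]


def add_sum_by_group(rows):
    """Append the per-WO# total of Value to every row (hypothetical helper)."""
    totals = defaultdict(int)
    for _serial, wo, value, _indicator in rows:
        totals[wo] += value
    # Second pass: look up the finished total for each row's WO#.
    return [row + (totals[row[1]],) for row in rows]


def group_totals(rows):
    """Return one (Serial, WO#, SumValue) row per group, in first-seen order."""
    totals = {}
    for serial, wo, value, _indicator in rows:
        key = (serial, wo)
        totals[key] = totals.get(key, 0) + value
    return [(serial, wo, total) for (serial, wo), total in totals.items()]


with_sum = add_sum_by_group(rows)
grouped = group_totals(rows)

for row in with_sum:
    print(row)
```

`grouped` corresponds to the end state described in the question: duplicates on WO# removed and the Value and Indicator columns dropped.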

Related

Flag every time an ID's date changes in DAX

I have a table with orders, the articles belonging to each order, and their shipping dates. What I want to do is flag every row where the shipping date changes or, when all dates for an OrderID are the same, flag only once.
I tried to use calculated columns written in DAX, such as nextdate, prevdate, nextorder and prevorder, and refer to them, but it doesn't work.
I would appreciate any tips on how to solve my problem. Thanks!
OrderID  Article ID  Shipping date  Flag
123      1           01.01.2012     1
123      2           01.01.2012     0
123      1           02.01.2012     1
1234     12          15.03.2012     1
678      12          25.05.2014     1
678      345         25.05.2014     0
678      567         25.05.2014     0
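The rule implied by the Flag column can be sketched in plain Python: a row gets 1 when it is the first row for its OrderID or when its shipping date differs from the previous row of the same order. This is only an illustration of the logic, not a DAX solution; the helper name `flag_date_changes` is hypothetical, and it assumes the rows are already in the order shown in the table.

```python
# Rows copied from the question: (OrderID, Article ID, Shipping date)
rows = [
    (123, 1, "01.01.2012"),
    (123, 2, "01.01.2012"),
    (123, 1, "02.01.2012"),
    (1234, 12, "15.03.2012"),
    (678, 12, "25.05.2014"),
    (678, 345, "25.05.2014"),
    (678, 567, "25.05.2014"),
]


def flag_date_changes(rows):
    """Return 1 for each row that starts a new shipping date within its order.

    Assumes rows are sorted so that rows of one order appear in date order.
    """
    flags = []
    last_date = {}  # OrderID -> shipping date of the most recent row seen
    for order_id, _article_id, ship_date in rows:
        # A missing entry (first row of the order) also counts as a change.
        changed = last_date.get(order_id) != ship_date
        flags.append(1 if changed else 0)
        last_date[order_id] = ship_date
    return flags


flags = flag_date_changes(rows)
print(flags)
```

In DAX the same comparison would need an explicit ordering column, since calculated columns have no built-in notion of a "previous row".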

PowerBI running Total formula

I have a dataset OvertimeHours with EMPLID, checkdate and NumberOfHours (and other fields). I need a running total NumberOfHours for each employee by checkdate. I tried using the Quick Measure option but that only allows for a single column and I have two. I do not want the measure to recalculate when filters are applied. Ultimately what I am trying to do is identify the records for the first 6 hours of overtime worked on each check so that they can get a category of OCB and all overtime over the first 6 hours is OTP and it does not have to be exact (as demonstrated in the output below). I have only been working with Power BI for about a month and this is a pretty complex (for me) formula to figure out...
EMPLID CheckDate WkDate NumberOfHours RunningTotal Category
124 1/1/19 12/20/18 5 5 OCB
124 1/1/19 12/21/18 9 14 OTP
125 1/1/19 12/20/18 3 3 OCB
125 1/1/19 12/20/18 2 5 OCB
125 1/1/19 12/22/18 2 7 OTP
124 1/15/19 1/8/19 3 3 OCB
*Edited to add the WkDate.
Edit:
I have tweaked my query so that I have the running total and a sequential counter now:
Using the first 12 records, I am looking to get the following results:
I can either do it in a query if that is the easiest way or if there is a way to use DAX in PowerBI with this dataset now that I have the sequential piece, I can do that too.
I got it in the query:
select r.CheckDate,
       r.EMPLID,
       case
           when PayrollRunningOTHours <= 6
           then PayrollRunningOTHours
           else 6
       end as OCBHours,
       case
           when PayRollRunningOTHours > 6
           then PayRollRunningOTHours - 6
       end as OTPHours
from #rollingtotal r
inner join lastone l
    on r.CheckDate = l.CheckDate
   and r.EMPLID = l.EMPLID
   and r.OTCounter = l.lastRec
order by r.emplid,
         r.CheckDate,
         r.OTCounter
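The running total and the 6-hour split can also be sketched outside SQL. The following plain Python is an illustration under stated assumptions, not a translation of the full query: the rows come from the sample table in the question, the names `categorize`, `split_hours` and `OCB_LIMIT` are hypothetical, and rows are assumed to arrive in the order in which the running total should accumulate.

```python
# Rows copied from the question: (EMPLID, CheckDate, WkDate, NumberOfHours)
rows = [
    (124, "1/1/19", "12/20/18", 5),
    (124, "1/1/19", "12/21/18", 9),
    (125, "1/1/19", "12/20/18", 3),
    (125, "1/1/19", "12/20/18", 2),
    (125, "1/1/19", "12/22/18", 2),
    (124, "1/15/19", "1/8/19", 3),
]

OCB_LIMIT = 6  # first 6 overtime hours per check count as OCB


def categorize(rows):
    """Add a running total per (EMPLID, CheckDate) and an OCB/OTP category."""
    running = {}
    out = []
    for emplid, check_date, wk_date, hours in rows:
        key = (emplid, check_date)
        running[key] = running.get(key, 0) + hours
        # Inexact by design: a row is OTP once the running total passes 6.
        category = "OCB" if running[key] <= OCB_LIMIT else "OTP"
        out.append((emplid, check_date, wk_date, hours, running[key], category))
    return out


def split_hours(rows):
    """Split each check's total into (OCBHours, OTPHours).

    Unlike the SQL above, OTPHours is 0 rather than NULL when the total is
    6 hours or less.
    """
    totals = {}
    for emplid, check_date, _wk_date, hours in rows:
        key = (emplid, check_date)
        totals[key] = totals.get(key, 0) + hours
    return {
        key: (min(total, OCB_LIMIT), max(total - OCB_LIMIT, 0))
        for key, total in totals.items()
    }


categorized = categorize(rows)
split = split_hours(rows)

for row in categorized:
    print(row)
```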

How should I write multiple IF statements in DAX using Power BI Desktop?

On Power BI Desktop, I am working with multiple conditional IF statements. I have an original table with user IDs and SecondsToOrder, looking like this.
UserID SecondsToOrder
00001 2320
00002 13
00003 389
00004 95
... ...
I created a new calculated column, MinutesRounded, to round the seconds down to whole minutes, and now my table looks like this.
UserID SecondsToOrder MinutesRounded
00001 2320 38
00002 13 0
00003 389 12
00004 95 1
... ... ...
Now I want to create another column based on my calculated column MinutesRounded, where depending on a number I assign each user to one of the following groups: '< 1 minute' '<15 minutes' and '> 15 minutes'. The end result should look like this.
UserID SecondsToOrder MinutesRounded Lenght
00001 2320 38 > 15 minutes
00002 13 0 < 1 minute
00003 389 12 < 15 minutes
00004 95 1 < 1 minute
... ... ... ...
I am doing it using DAX by this statement.
Lenght = IF([MinutesRounded]<1,"< 1 minute",IF([MinutesRounded]<15,"<15 minutes", "> 15 minutes"))
I am getting a syntax error and really don't understand what is wrong here. Could you please help? The error I am getting is below:
The syntax for '"< 1 minute"' is incorrect. (DAX(IF([MinutesRounded]<1."< 1 minute",IF([MinutesRounded]<15."<15 minutes", "> 15 minutes")))).
For some reason I see dots and brackets appearing in the error which I haven't even typed. How should I fix it?
UPDATE: I found that the cause was a regional/keyboard setting within Power BI, which is why I had to use semicolons instead of commas. The code itself was correct.
I get no error using your DAX exactly as it is:
= IF([MinutesRounded]<1,"< 1 minute",IF([MinutesRounded]<15,"<15 minutes", "> 15 minutes"))
You can also use SWITCH:
=
SWITCH (
    TRUE (),
    [MinutesRounded] < 1, "< 1 minute",
    [MinutesRounded] < 15, "<15 minutes",
    "> 15 minutes"
)
Thanks

replace multiple column values at the same time

I would like to replace multiple column values at the same time in a data frame. I would like to change 2 to 1 and 1 to 2.
data=data.frame(store=c(122,323,254,435,654,342,234,344)
,cluster=c(2,2,2,1,1,3,3,3))
The problem in my code is that after it changes 2 to 1, it changes these 1's back to 2.
Can I do it in dplyr or something similar? Thank you
Desired data set below
store cluster
122 1
323 1
254 1
435 2
654 2
342 3
234 3
344 3
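The overwrite problem described above comes from applying the two replacements one after the other. A single lookup per value avoids it, because each original value is translated exactly once. Here is a minimal sketch of that idea in plain Python rather than R; the name `swap` is hypothetical and the data is copied from the question.

```python
# Data copied from the question.
stores = [122, 323, 254, 435, 654, 342, 234, 344]
clusters = [2, 2, 2, 1, 1, 3, 3, 3]

# One mapping consulted once per value, so 2 -> 1 and 1 -> 2 cannot
# interfere with each other. Values not in the mapping stay unchanged.
swap = {2: 1, 1: 2}
recoded = [swap.get(c, c) for c in clusters]

for store, cluster in zip(stores, recoded):
    print(store, cluster)
```

The same one-pass principle applies in R: any recoding that maps all old values to new values in a single step, instead of chaining two assignments, gives the desired data set.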

How to append a new column to my Pandas DataFrame based on a row-based calculation?

Let's say I have a Pandas DataFrame with two columns: 1) user_id, 2) steps (which contains the number of steps on the given date). Now I want to calculate the difference between the number of steps and the number of steps in the preceding measurement (measurements are guaranteed to be in order within my DataFrame).
So basically this comes down to appending an extra column to my DataFrame where the row values of this data frame match the value of the column 'steps' within this same row, minus the value of the 'steps' column in the row above (or 0 if this is the first row). To complicate things further, I want to calculate these differences per user_id, so I want to make sure that I do not subtract the steps values of two rows with different user_id's.
Does anyone have an idea how to get this done with Python 2.7 and pandas?
So an example to illustrate this.
Example input:
user_id steps
1015 48
1015 23
1015 79
1016 10
1016 20
Desired output:
user_id steps d_steps
1015 48 0
1015 23 -25
1015 79 56
2023 10 0
2023 20 10
Your output shows user ids that are not in your original data, but the following does what you want. You will have to replace/fill the NaN values with 0:
In [16]:
df['d_steps'] = df.groupby('user_id')['steps'].transform('diff')
df.fillna(0, inplace=True)
df
Out[16]:
user_id steps d_steps
0 1015 48 0
1 1015 23 -25
2 1015 79 56
3 1016 10 0
4 1016 20 10
Here we generate the desired column by calling transform on the groupby object and passing a string that maps to the diff method, which subtracts the previous row's value within each group. Transform applies a function and returns a series with an index aligned to the df, so the first row of each user_id comes back as NaN and is then filled with 0.