I want to replace each blank value with the most recent previous value where "DepositLclCcy" is not blank, i.e. 5 in this case for SEK.
How do I write this formula?
Current Result
PositionDate Currency DepositLclCcy
2017-04-11 SEK 1
2017-04-11 DKK 3
2017-04-11 EUR 7
2017-04-10 SEK (blank)
2017-04-10 DKK 3
2017-04-10 EUR 5
2017-04-07 SEK 5
2017-04-07 DKK 3
2017-04-07 EUR 5
Desired Result
PositionDate Currency DepositLclCcy
2017-04-11 SEK 1
2017-04-11 DKK 3
2017-04-11 EUR 7
2017-04-10 SEK 5
2017-04-10 DKK 3
2017-04-10 EUR 5
2017-04-07 SEK 5
2017-04-07 DKK 3
2017-04-07 EUR 5
You can use the following DAX to create the calculated column:
ReplacedDepositLclCcy =
IF(
    ISBLANK('Table'[DepositLclCcy]),
    CALCULATE(
        FIRSTNONBLANK('Table'[DepositLclCcy], 0),
        FILTER(
            'Table',
            'Table'[PositionDate] < EARLIER('Table'[PositionDate]) &&
            'Table'[Currency] = EARLIER('Table'[Currency]) &&
            'Table'[PositionDate] = LASTDATE('Table'[PositionDate]) &&
            NOT(ISBLANK('Table'[DepositLclCcy]))
        )
    ),
    'Table'[DepositLclCcy]
)
It filters the rows down to those with the same Currency, an earlier PositionDate and a non-blank DepositLclCcy, takes the one with the max PositionDate, and returns its DepositLclCcy. It also works when there are consecutive blank values.
Depending on the data type of DepositLclCcy you may need to change ISBLANK('Table'[DepositLclCcy]) to 'Table'[DepositLclCcy] = "".
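If you want to cross-check the expected values outside Power BI, the same rule can be sketched in plain Python. This is only an illustration of the logic, not a translation of the DAX; the names rows and fill_from_previous are made up for the example, and None stands in for a blank.

```python
from datetime import date

# Sample rows from the question: (PositionDate, Currency, DepositLclCcy).
# None marks the blank value.
rows = [
    (date(2017, 4, 11), "SEK", 1),
    (date(2017, 4, 11), "DKK", 3),
    (date(2017, 4, 11), "EUR", 7),
    (date(2017, 4, 10), "SEK", None),
    (date(2017, 4, 10), "DKK", 3),
    (date(2017, 4, 10), "EUR", 5),
    (date(2017, 4, 7), "SEK", 5),
    (date(2017, 4, 7), "DKK", 3),
    (date(2017, 4, 7), "EUR", 5),
]


def fill_from_previous(rows):
    """Replace each blank with the value from the latest earlier date
    that has the same currency and a non-blank value."""
    filled = []
    for position_date, currency, value in rows:
        if value is None:
            # Candidates: same currency, strictly earlier date, non-blank.
            candidates = [
                (d, v)
                for d, c, v in rows
                if c == currency and d < position_date and v is not None
            ]
            if candidates:
                # Keep the value that belongs to the most recent date.
                value = max(candidates, key=lambda dv: dv[0])[1]
        filled.append((position_date, currency, value))
    return filled


result = fill_from_previous(rows)
print(result[3][2])  # 5, the SEK value from 2017-04-07
```

Because the lookup always goes back to the original rows, a run of consecutive blanks resolves to the same last non-blank value.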
I have a table 'fact WorkItems', where changes made to items are saved on the day of the change ([ChangedDate]).
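WorkItemId  AreaPath  ChangedDate  State    StoryPoints  IterationPath  1day     14day    Commitment  Delivered
1           A         04/01/2022   New      4            24.1           4        (blank)  (blank)     (blank)
1           A         06/01/2022   Ready    6            24.1           6        (blank)  (blank)     (blank)
1           A         10/01/2022   Active   6            24.1           6        (blank)  (blank)     (blank)
1           A         12/01/2022   Testing  8            24.1           8        (blank)  8           (blank)
1           A         18/01/2022   Testing  2            24.1           (blank)  (blank)  (blank)     (blank)
1           A         19/01/2022   Testing  2            24.1           (blank)  (blank)  (blank)     (blank)
1           A         28/01/2022   Closed   2            24.1           (blank)  2        (blank)     (blank)
1           A         01/02/2022   Closed   2            24.1           (blank)  2        (blank)     2
1           A         03/02/2022   Closed   2            24.1           (blank)  (blank)  (blank)     (blank)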
I also have a table called 'dim Iterations', where the cycles' start and end dates are kept.
IterationPath  StartDate   EndDate
24.1           13/01/2022  02/02/2022
24.2           03/02/2022  17/02/2022
24.3           18/02/2022  04/03/2022
I have created a measure [1day] which tells me how many [StoryPoints] an item had on or before the day a cycle started:
1day =
CALCULATE (
    SUM ( 'fact WorkItems'[StoryPoints] ),
    FILTER (
        'dim Iterations',
        MAX ( 'fact WorkItems'[ChangedDate] ) <= 'dim Iterations'[StartDate]
    )
)
I have also created a measure [14day] which tells me how many [StoryPoints] an item had by the end of the cycle, once it is in the "Closed" [State]:
14day =
CALCULATE (
    SUM ( 'fact WorkItems'[StoryPoints] ),
    FILTER (
        'fact WorkItems',
        AND (
            'fact WorkItems'[State] = "Closed",
            'fact WorkItems'[ChangedDate] <= MIN ( 'dim Iterations'[EndDate] )
        )
    )
)
My objective is to create two measures, [Commitment] and [Delivered], which sum all the StoryPoints each team ([AreaPath]) had for each cycle exactly on the day the cycle started and on the day it ended (the latter only with State "Closed"). I know how to "tell" Power BI to look only at the rows where [ChangedDate] = [StartDate] or [EndDate].
The problem arises when the last change to an item was made before the start/end date. I'm trying to "fill in those gaps" by telling Power BI to look at the last record that was changed on or before the Start/End Date and to use its values. What I'm trying to achieve is shown in the table above. Any ideas?
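To make the intended "last record on or before a date" rule concrete, here is a minimal sketch in plain Python using the rows for work item 1. It is not a DAX measure, and the helper names (last_record_on_or_before, commitment, delivered) are hypothetical. It also assumes that [Delivered] should count only when that last record is in the "Closed" state.

```python
from datetime import date

# Records for work item 1 from the question: (ChangedDate, State, StoryPoints).
records = [
    (date(2022, 1, 4), "New", 4),
    (date(2022, 1, 6), "Ready", 6),
    (date(2022, 1, 10), "Active", 6),
    (date(2022, 1, 12), "Testing", 8),
    (date(2022, 1, 18), "Testing", 2),
    (date(2022, 1, 19), "Testing", 2),
    (date(2022, 1, 28), "Closed", 2),
    (date(2022, 2, 1), "Closed", 2),
    (date(2022, 2, 3), "Closed", 2),
]


def last_record_on_or_before(records, cutoff):
    """Return the most recent record whose ChangedDate is <= cutoff, or None."""
    eligible = [r for r in records if r[0] <= cutoff]
    return max(eligible, key=lambda r: r[0]) if eligible else None


def commitment(records, start_date):
    """StoryPoints as they stood on the day the cycle started."""
    record = last_record_on_or_before(records, start_date)
    return record[2] if record else 0


def delivered(records, end_date):
    """StoryPoints as they stood at cycle end, counted only if Closed (assumption)."""
    record = last_record_on_or_before(records, end_date)
    return record[2] if record and record[1] == "Closed" else 0


# Iteration 24.1 runs from 13/01/2022 to 02/02/2022.
print(commitment(records, date(2022, 1, 13)))  # 8, from the 12/01/2022 record
print(delivered(records, date(2022, 2, 2)))    # 2, from the 01/02/2022 record
```

A per-team figure would then be the sum of these values over all work items in the same [AreaPath].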
I have two tables.
A campaign table:
Campaign ID  Start Date  End Date    Daily Target
1            21/12/2020  15/02/2021  5
2            18/10/2020  18/01/2021  3
3            01/07/2020  03/01/2021  8
4            09/01/2021  15/05/2021  1
5            05/08/2020  09/01/2021  2
And a simple Date table:
Date
01/01/2021
02/01/2021
03/01/2021
04/01/2021
05/01/2021
06/01/2021
07/01/2021
08/01/2021
09/01/2021
10/01/2021
11/01/2021
12/01/2021
13/01/2021
What I would like to do is add a calculated column to the Date table that sums the Daily Targets of all campaigns running on that date, i.e. where the date falls between Start Date and End Date. So for 1st January 2021 I want the sum of the Daily Targets for Campaigns 1, 2, 3 & 5. E.g.:
Date        Total Daily Target
01/01/2021  18
02/01/2021  18
03/01/2021  18
04/01/2021  10
05/01/2021  10
06/01/2021  10
07/01/2021  10
08/01/2021  10
09/01/2021  9
10/01/2021  9
11/01/2021  9
12/01/2021  9
13/01/2021  9
I'm quite new to DAX and have tried multiple different variations of SUM(), SUMX() & FILTER() within CALCULATE(), all to no avail. I also don't know what the relationship between the two tables should be seeing as there are two dates in the Campaign table? Any help at all would be greatly appreciated.
Try the measure below:
Measure =
VAR current_row_date = MIN ( 'date'[Date] )
RETURN
    CALCULATE (
        SUM ( campaign[Daily Target] ),
        campaign[Start Date] <= current_row_date
            && campaign[End Date] >= current_row_date
    )
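To sanity-check the numbers, the same condition (Start Date <= date <= End Date, both ends inclusive as in the measure) can be sketched in plain Python. The function name total_daily_target is made up for this example; the campaign rows are copied from the question.

```python
from datetime import date

# Campaign rows from the question: (Campaign ID, Start Date, End Date, Daily Target).
campaigns = [
    (1, date(2020, 12, 21), date(2021, 2, 15), 5),
    (2, date(2020, 10, 18), date(2021, 1, 18), 3),
    (3, date(2020, 7, 1), date(2021, 1, 3), 8),
    (4, date(2021, 1, 9), date(2021, 5, 15), 1),
    (5, date(2020, 8, 5), date(2021, 1, 9), 2),
]


def total_daily_target(day):
    """Sum Daily Target over campaigns whose inclusive date range contains day."""
    return sum(
        target
        for _, start, end, target in campaigns
        if start <= day <= end
    )


print(total_daily_target(date(2021, 1, 1)))  # 18 (campaigns 1, 2, 3 and 5)
print(total_daily_target(date(2021, 1, 4)))  # 10 (campaigns 1, 2 and 5)
```

No relationship between the two tables is needed for this kind of lookup, since each date is compared against both date columns directly.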
Using the information below I need to create a new table in DAX called Table (Download a demo file here).
I need to find the location of each employee (column "Name") at the time of the sale date in column "Sale Date" based on their contract details in table DbEmployees. If there is more than one valid contract for a given employee that the sale date fits in, use the shortest contract length.
My problem is that the below measure isn't working to generate column "Location", but it works just fine for column "new value".
Why is this happening and how can it be fixed?
Expected result:
SaleID  EmployeeID  Sale Date   new value  Name         Location
1       45643213    2021-02-04  89067445   Sally Shore  4
2       57647868    2020-04-15  57647868   Paul Bunyon  3
3       89067445    2019-09-24  57647868   Paul Bunyon  6
DbEmployees:
ID         Name           StartDate   EndDate     Location  Position
546465546  Sandra Newman  2021/01/01  2021/12/31  1         Manager
546465546  Sandra Newman  2020/01/01  2020/12/31  2         Clerk
546465546  Sandra Newman  2019/01/01  2019/12/31  3         Clerk
545365743  Paul Bunyon    2021/01/01  2021/12/31  6         Manager
545365743  Paul Bunyon    2020/04/01  2020/05/01  3         Clerk
545365743  Paul Bunyon    2019/04/01  2021/01/01  6         Manager
796423504  Sally Shore    2020/01/01  2020/12/31  4         Clerk
783546053  Jack Tomson    2019/01/01  2019/12/31  2         Manager
DynamicsSales:
SaleID  EmployeeID  Sale Date
1       45643213    2021/02/04
2       57647868    2020/04/15
3       89067445    2019/09/24
DynamicsContacts:
EmployeeID  Name           Email
45643213    Sandra Newman  sandra.newman#hotmail.com
65437658    Jack Tomson    jack.tomson#hotmail.com
57647868    Paul Bunyon    paul.bunyon#hotmail.com
89067445    Sally Shore    sally.shore#hotmail.com
DynamicsAudit:
SaleID  Changed Date  old value  new value  AuditID  Valid Until
1       2019/06/08    65437658   57647868   1        2020-06-07
1       2020/06/07    57647868   89067445   2        2021-05-07
1       2021/05/07    89067445   45643213   3        2021-05-07
2       2019/06/08    65437658   57647868   4        2020-06-07
2       2020/06/07    57647868   89067445   5        2021-05-07
2       2021/05/07    89067445   45643213   6        2021-05-07
3       2019/06/08    65437658   57647868   7        2020-06-07
3       2020/06/07    57647868   89067445   8        2021-05-07
3       2021/05/07    89067445   45643213   9        2021-05-07
From what I can see there are a couple of issues with your formula.
First of all, there is no relationship between Table and DbEmployees, so you are filtering exclusively on the dates, which might get you the wrong Location. This can be fixed by also filtering on the employee name:
Location =
VAR CurrentContractDate = [Sale Date]
VAR empName = [Name]
RETURN
    VAR RespLocation =
        TOPN (
            1,
            FILTER(DbEmployees, DbEmployees[Name] = empName),
            IF (
            .....
Secondly, remember that the TOPN function can return multiple rows. From the documentation:
If there is a tie, in order_by values, at the N-th row of the table, then all tied rows are returned. Then, when there are ties at the N-th row the function might return more than n rows.
This can be fixed by picking the Max/Min of the result in the table:
RETURN MAXX(SELECTCOLUMNS( RespLocation,"Location", [Location] ), [Location])
Finally, I don't understand why the last row on the expected result should be a 3, given that the sale date is within a record with location 6.
Full expression:
Location =
VAR CurrentContractDate = [Sale Date]
VAR empName = [Name]
RETURN
    VAR RespLocation =
        TOPN (
            1,
            FILTER(DbEmployees, DbEmployees[Name] = empName),
            IF (
                CurrentContractDate <= DbEmployees[EndDate]
                    && CurrentContractDate >= DbEmployees[StartDate], //Check, whether there is matching date
                DATEDIFF ( DbEmployees[StartDate], DbEmployees[EndDate], DAY ), //If so, rank matching locations (you may want to employ a different formula)
                MIN ( //If the location is not matching, calculate how close it is (from both start and end date)
                    ABS ( DATEDIFF ( CurrentContractDate, DbEmployees[StartDate], DAY ) ),
                    ABS ( DATEDIFF ( CurrentContractDate, DbEmployees[EndDate], DAY ) )
                ) + 1000000 //Add a discriminating factor in case there are matching rows that should be favoured over non-matching.
            ), 1
        )
    RETURN
        MAXX(SELECTCOLUMNS( RespLocation,"Location", [Location] ), [Location])
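The ranking idea behind the expression can be sketched in plain Python to check it against the expected result. This is only an illustration under a few assumptions: the names sort_key, location_for and PENALTY are invented for the example, only the contracts for the two names in the expected result are included, and ties are broken by taking the largest Location, mirroring the MAXX step.

```python
from datetime import date

# (Name, StartDate, EndDate, Location) rows taken from DbEmployees.
contracts = [
    ("Paul Bunyon", date(2021, 1, 1), date(2021, 12, 31), 6),
    ("Paul Bunyon", date(2020, 4, 1), date(2020, 5, 1), 3),
    ("Paul Bunyon", date(2019, 4, 1), date(2021, 1, 1), 6),
    ("Sally Shore", date(2020, 1, 1), date(2020, 12, 31), 4),
]

# Pushes non-matching contracts behind every matching one.
PENALTY = 1_000_000


def sort_key(contract, sale_date):
    """Matching contracts rank by length in days; others by distance plus a penalty."""
    _, start, end, _ = contract
    if start <= sale_date <= end:
        return (end - start).days
    nearest = min(abs((sale_date - start).days), abs((sale_date - end).days))
    return nearest + PENALTY


def location_for(name, sale_date):
    """Pick the best-ranked contract for name; on ties take the largest Location."""
    candidates = [c for c in contracts if c[0] == name]
    if not candidates:
        return None
    best = min(sort_key(c, sale_date) for c in candidates)
    tied = [c[3] for c in candidates if sort_key(c, sale_date) == best]
    return max(tied)


print(location_for("Sally Shore", date(2021, 2, 4)))   # 4
print(location_for("Paul Bunyon", date(2020, 4, 15)))  # 3
print(location_for("Paul Bunyon", date(2019, 9, 24)))  # 6
```

For the 2020-04-15 sale two contracts match, and the shorter one (location 3) wins because its length in days is the smaller sort key.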
Importing the data frame
import pandas as pd
df = pd.read_csv("C:\\Users")
Printing the list of employees' usernames
print(df['AssignedTo'])
Returns:
Out[4]:
0 vaughad
1 channln
2 stalasi
3 mitras
4 martil
5 erict
6 erict
7 channln
8 saia
9 channln
10 roedema
11 vaughad
Printing the dates
print(df['FilledEnd'])
Returns:
Out[6]:
0 2015-11-05
1 2016-05-27
2 2016-04-26
3 2016-02-18
4 2016-02-18
5 2015-11-02
6 2016-01-14
7 2015-12-15
8 2015-12-31
9 2015-10-16
10 2016-01-07
11 2015-11-20
Now I need to collect the latest date per employee?
I have tried:
MaxDate = max(df.FilledEnd)
But this just returns one date for all employees.
The data set contains multiple employees with different dates. In a new column named "LatestDate" I need the latest date that corresponds to each employee: for "vaughad" it would return "2015-11-20" on all of the "vaughad" records, and for username "channln" it would return "2016-05-27" on all of the "channln" records.
You need to group your data first, using DataFrame.groupby(), after which you can produce aggregate values, like the maximum date in the FilledEnd series:
df.groupby('AssignedTo')['FilledEnd'].max()
This produces a series, with AssignedTo as the index, and the latest date for each of those employees as the values:
>>> df.groupby('AssignedTo')['FilledEnd'].max()
AssignedTo
channln 2016-05-27
erict 2016-01-14
martil 2016-02-18
mitras 2016-02-18
roedema 2016-01-07
saia 2015-12-31
stalasi 2016-04-26
vaughad 2015-11-20
Name: FilledEnd, dtype: object
If you want to add those maximum dates back to the dataframe, use groupby(...).transform('max') instead, so you get a series with the same index as the original dataframe:
df['MaxDate'] = df.groupby('AssignedTo')['FilledEnd'].transform('max')
This adds in a MaxDate column:
AssignedTo FilledEnd MaxDate
0 vaughad 2015-11-05 2015-11-20
1 channln 2016-05-27 2016-05-27
2 stalasi 2016-04-26 2016-04-26
3 mitras 2016-02-18 2016-02-18
4 martil 2016-02-18 2016-02-18
5 erict 2015-11-02 2016-01-14
6 erict 2016-01-14 2016-01-14
7 channln 2015-12-15 2016-05-27
8 saia 2015-12-31 2015-12-31
9 channln 2015-10-16 2016-05-27
10 roedema 2016-01-07 2016-01-07
11 vaughad 2015-11-20 2015-11-20
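Note that the output above reports dtype: object, which suggests the dates are stored as strings. Taking the maximum of strings happens to work for the YYYY-MM-DD format, but converting to real datetimes first is more robust. A minimal self-contained sketch, assuming the sample values shown in the question (the conversion step with pd.to_datetime is an addition, not part of the original data loading):

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "AssignedTo": ["vaughad", "channln", "stalasi", "mitras", "martil", "erict",
                   "erict", "channln", "saia", "channln", "roedema", "vaughad"],
    "FilledEnd": ["2015-11-05", "2016-05-27", "2016-04-26", "2016-02-18",
                  "2016-02-18", "2015-11-02", "2016-01-14", "2015-12-15",
                  "2015-12-31", "2015-10-16", "2016-01-07", "2015-11-20"],
})

# Parse the strings so comparisons are done on dates, not on text.
df["FilledEnd"] = pd.to_datetime(df["FilledEnd"])

# transform keeps the original index, so the result lines up row by row.
df["MaxDate"] = df.groupby("AssignedTo")["FilledEnd"].transform("max")

print(df)
```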
I have a dataframe as given below:
Index Date Country Occurence
0 2013-12-30 US 1
1 2013-12-30 India 3
2 2014-01-10 US 1
3 2014-01-15 India 1
4 2014-02-05 UK 5
I want to convert the daily data into weekly data, grouped by Country, using sum as the aggregation method.
I tried resampling, but the output was a MultiIndex data frame from which I was not able to access the "Country" and "Date" columns (please refer to the table above).
The desired output is given below:
Date Country Occurence
Week1 India 4
Week2
Week1 US 2
Week2
Week5 Germany 5
You can groupby on country and resample on week
In [63]: df
Out[63]:
Date Country Occurence
0 2013-12-30 US 1
1 2013-12-30 India 3
2 2014-01-10 US 1
3 2014-01-15 India 1
4 2014-02-05 UK 5
In [64]: df.set_index('Date').groupby('Country').resample('W')['Occurence'].sum()
Out[64]:
Country  Date
India    2014-01-05    3
         2014-01-12    0
         2014-01-19    1
UK       2014-02-09    5
US       2014-01-05    1
         2014-01-12    1
Name: Occurence, dtype: int64
And, you could use reset_index()
In [65]: df.set_index('Date').groupby('Country').resample('W')['Occurence'].sum().reset_index()
Out[65]:
  Country       Date  Occurence
0   India 2014-01-05          3
1   India 2014-01-12          0
2   India 2014-01-19          1
3      UK 2014-02-09          5
4      US 2014-01-05          1
5      US 2014-01-12          1
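One step the session above takes for granted is that the Date column already holds datetimes, since resample only works on a datetime-like index. A minimal self-contained sketch, assuming the dates start out as strings and a recent pandas version (where the aggregation is chained as a method call rather than passed through a how argument):

```python
import pandas as pd

# Sample data copied from the question, with dates as plain strings.
df = pd.DataFrame({
    "Date": ["2013-12-30", "2013-12-30", "2014-01-10", "2014-01-15", "2014-02-05"],
    "Country": ["US", "India", "US", "India", "UK"],
    "Occurence": [1, 3, 1, 1, 5],
})

# resample needs a DatetimeIndex, so parse the strings first.
df["Date"] = pd.to_datetime(df["Date"])

weekly = (
    df.set_index("Date")
      .groupby("Country")
      .resample("W")["Occurence"]  # weekly bins labelled by the week-ending Sunday
      .sum()
      .reset_index()               # back to ordinary Country and Date columns
)

print(weekly)
```

After reset_index(), Country and Date are ordinary columns again, so they can be accessed with weekly['Country'] and weekly['Date'].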