How can I create subtotals for multiple levels of groups in List & Label - grouping

Using List and Label, is it possible to create multiple levels of groups with subtotals?
For example, if my table has the following columns:
Currency
Transaction Date
Transaction Type
Transaction Identifier
Amount
Can I create a report with multiple levels of grouping (e.g. Currency, Transaction Date & Transaction Type)?
I.e. a report which shows:
Currency 1 Transaction Date 1 Transaction Type 1 Transaction Identifier 1 250
Currency 1 Transaction Date 1 Transaction Type 1 Transaction Identifier 2 (300)
Total for Currency 1, Transaction Date 1, Transaction Type 1 (50)
Currency 1 Transaction Date 1 Transaction Type 2 Transaction Identifier 3 100
Total for Currency 1, Transaction Date 1, Transaction Type 2 100
Currency 1 Transaction Date 1 Transaction Type 4 Transaction Identifier 4 125
Currency 1 Transaction Date 1 Transaction Type 4 Transaction Identifier 5 (500)
Total for Currency 1, Transaction Date 1, Transaction Type 4 (375)
Total for Currency 1, Transaction Date 1 (325)
Currency 1 Transaction Date 2 Transaction Type 1 Transaction Identifier 6 (75)
Currency 1 Transaction Date 2 Transaction Type 1 Transaction Identifier 7 600
Currency 1 Transaction Date 2 Transaction Type 1 Transaction Identifier 8 400
Total for Currency 1, Transaction Date 2, Transaction Type 1 925
Currency 1 Transaction Date 2 Transaction Type 2 Transaction Identifier 9 100
Currency 1 Transaction Date 2 Transaction Type 2 Transaction Identifier 10 25
Currency 1 Transaction Date 2 Transaction Type 2 Transaction Identifier 11 (50)
Currency 1 Transaction Date 2 Transaction Type 2 Transaction Identifier 12 (100)
Currency 1 Transaction Date 2 Transaction Type 2 Transaction Identifier 13 100
Total for Currency 1, Transaction Date 2, Transaction Type 2 75
Currency 1 Transaction Date 2 Transaction Type 3 Transaction Identifier 14 200
Currency 1 Transaction Date 2 Transaction Type 3 Transaction Identifier 15 800
Currency 1 Transaction Date 2 Transaction Type 3 Transaction Identifier 16 100
Total for Currency 1, Transaction Date 2, Transaction Type 3 1,100
Total for Currency 1, Transaction Date 2 2,100
Currency 1 Transaction Date 3 Transaction Type 1 Transaction Identifier 17 (50)
Currency 1 Transaction Date 3 Transaction Type 1 Transaction Identifier 18 1,000
Currency 1 Transaction Date 3 Transaction Type 1 Transaction Identifier 19 350
Total for Currency 1, Transaction Date 3, Transaction Type 1 1,300
Currency 1 Transaction Date 3 Transaction Type 5 Transaction Identifier 20 75
Total for Currency 1, Transaction Date 3, Transaction Type 5 75
Total for Currency 1, Transaction Date 3 1,375
Total for Currency 1 3,150
Currency 2 Transaction Date 1 Transaction Type 1 Transaction Identifier 21 (75)
Currency 2 Transaction Date 1 Transaction Type 1 Transaction Identifier 22 600
Currency 2 Transaction Date 1 Transaction Type 1 Transaction Identifier 23 800
Currency 2 Transaction Date 1 Transaction Type 1 Transaction Identifier 24 (50)
Currency 2 Transaction Date 1 Transaction Type 1 Transaction Identifier 25 250
Currency 2 Transaction Date 1 Transaction Type 1 Transaction Identifier 26 350
Currency 2 Transaction Date 1 Transaction Type 1 Transaction Identifier 27 (300)
Total for Currency 2, Transaction Date 1, Transaction Type 1 1,575
Currency 2 Transaction Date 1 Transaction Type 2 Transaction Identifier 28 100
Currency 2 Transaction Date 1 Transaction Type 2 Transaction Identifier 29 125
Total for Currency 2, Transaction Date 1, Transaction Type 2 225
Currency 2 Transaction Date 1 Transaction Type 3 Transaction Identifier 30 400
Total for Currency 2, Transaction Date 1, Transaction Type 3 400
Currency 2 Transaction Date 1 Transaction Type 4 Transaction Identifier 31 1,000
Total for Currency 2, Transaction Date 1, Transaction Type 4 1,000
Total for Currency 2, Transaction Date 1 3,200
Currency 2 Transaction Date 2 Transaction Type 2 Transaction Identifier 32 (50)
Currency 2 Transaction Date 2 Transaction Type 2 Transaction Identifier 33 (100)
Total for Currency 2, Transaction Date 2, Transaction Type 2 (150)
Currency 2 Transaction Date 2 Transaction Type 3 Transaction Identifier 34 100
Total for Currency 2, Transaction Date 2, Transaction Type 3 100
Currency 2 Transaction Date 2 Transaction Type 4 Transaction Identifier 35 25
Total for Currency 2, Transaction Date 2, Transaction Type 4 25
Total for Currency 2, Transaction Date 2 (25)
Currency 2 Transaction Date 3 Transaction Type 1 Transaction Identifier 36 100
Currency 2 Transaction Date 3 Transaction Type 1 Transaction Identifier 37 (500)
Total for Currency 2, Transaction Date 3, Transaction Type 1 (400)
Currency 2 Transaction Date 3 Transaction Type 2 Transaction Identifier 38 100
Currency 2 Transaction Date 3 Transaction Type 2 Transaction Identifier 39 75
Total for Currency 2, Transaction Date 3, Transaction Type 2 175
Currency 2 Transaction Date 3 Transaction Type 5 Transaction Identifier 40 200
Total for Currency 2, Transaction Date 3, Transaction Type 5 200
Total for Currency 2, Transaction Date 3 (25)
Total for Currency 2 3,150
Currency 3 Transaction Date 1 Transaction Type 1 Transaction Identifier 41 200
Currency 3 Transaction Date 1 Transaction Type 1 Transaction Identifier 42 (50)
Currency 3 Transaction Date 1 Transaction Type 1 Transaction Identifier 43 100
Currency 3 Transaction Date 1 Transaction Type 1 Transaction Identifier 44 (75)
Currency 3 Transaction Date 1 Transaction Type 1 Transaction Identifier 45 800
Currency 3 Transaction Date 1 Transaction Type 1 Transaction Identifier 46 (100)
Currency 3 Transaction Date 1 Transaction Type 1 Transaction Identifier 47 250
Currency 3 Transaction Date 1 Transaction Type 1 Transaction Identifier 48 400
Total for Currency 3, Transaction Date 1, Transaction Type 1 1,525
Currency 3 Transaction Date 1 Transaction Type 2 Transaction Identifier 49 1,000
Currency 3 Transaction Date 1 Transaction Type 2 Transaction Identifier 50 (300)
Total for Currency 3, Transaction Date 1, Transaction Type 2 700
Currency 3 Transaction Date 1 Transaction Type 3 Transaction Identifier 51 100
Total for Currency 3, Transaction Date 1, Transaction Type 3 100
Currency 3 Transaction Date 1 Transaction Type 5 Transaction Identifier 52 600
Total for Currency 3, Transaction Date 1, Transaction Type 5 600
Total for Currency 3, Transaction Date 1 2,925
Currency 3 Transaction Date 2 Transaction Type 1 Transaction Identifier 53 100
Currency 3 Transaction Date 2 Transaction Type 1 Transaction Identifier 54 25
Currency 3 Transaction Date 2 Transaction Type 1 Transaction Identifier 55 125
Currency 3 Transaction Date 2 Transaction Type 1 Transaction Identifier 56 350
Total for Currency 3, Transaction Date 2, Transaction Type 1 600
Currency 3 Transaction Date 2 Transaction Type 2 Transaction Identifier 57 (50)
Currency 3 Transaction Date 2 Transaction Type 2 Transaction Identifier 58 75
Total for Currency 3, Transaction Date 2, Transaction Type 2 25
Total for Currency 3, Transaction Date 2 625
Currency 3 Transaction Date 3 Transaction Type 1 Transaction Identifier 59 (500)
Total for Currency 3, Transaction Date 3, Transaction Type 1 (500)
Currency 3 Transaction Date 3 Transaction Type 2 Transaction Identifier 60 100
Total for Currency 3, Transaction Date 3, Transaction Type 2 100
Total for Currency 3, Transaction Date 3 (400)
Total for Currency 3 3,150

Sure, you can have an arbitrary number of group levels. Simply make sure your "Group By" string takes the hierarchy into account. You'll have three group footers: the first has a "Group By" of ToString$(Currency), the second ToString$(Currency)+ToString$(TransactionDate), and the third ToString$(Currency)+ToString$(TransactionDate)+ToString$(TransactionType). Note that your data source needs to be sorted accordingly; your excerpt above looks fine in this respect.
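That is, from the outermost to the innermost footer, the three "Group By" expressions would look like this (field names assumed to match your data source):
ToString$(Currency)
ToString$(Currency) + ToString$(TransactionDate)
ToString$(Currency) + ToString$(TransactionDate) + ToString$(TransactionType)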
For the subtotals, simply add three sum variables for the three levels and choose which one to reset in the sum variable dialog (it's the same on the group footer tab).
You'll also find a sample for hierarchical groupings in the sample application (Simple Lists > Item Report with Grouping).

Related

DAX equation to average data with different timespans

I have data for different companies. The data stops at day 10 for one of the companies (Company 1), day 6 for the others. If Company 1 is selected with other companies, I want to show the average so that the data runs until day 10, but using day 7, 8, 9, 10 values for Company 1 and day 6 values for others.
I'd want to just fill down days 7-10 for the other companies with the day 6 value, but that would look misleading on the graph. So I need a DAX equation with some magic in it.
As an example, I have companies:
Company 1
Company 2
Company 3
etc. as a filter
And a table like:
Company      Date         Day of Month   Count
Company 1    1.11.2022    1              10
Company 1    2.11.2022    2              20
Company 1    3.11.2022    3              21
Company 1    4.11.2022    4              30
Company 1    5.11.2022    5              40
Company 1    6.11.2022    6              50
Company 1    7.11.2022    7              55
Company 1    8.11.2022    8              60
Company 1    9.11.2022    9              62
Company 1    10.11.2022   10             70
Company 1    11.11.2022   11             NULL
Company 2    1.11.2022    1              15
Company 2    2.11.2022    2              25
Company 2    3.11.2022    3              30
Company 2    4.11.2022    4              34
Company 2    5.11.2022    5              45
Company 2    6.11.2022    6              100
Company 2    7.11.2022    7              NULL
Every date has a row, but beyond day 6 (Company 2) or day 10 (Company 1) the count is NULL. If Company 1 or Company 2 is chosen separately, I'd like to show the count as is. If they are chosen together, I'd like the average of the two so that:
Day 5: AVG(40,45)
Day 6: AVG(50,100)
Day 7: AVG(55,100)
Day 8: AVG(60,100)
Day 9: AVG(62,100)
Day 10: AVG(70,100)
Any ideas?
You want something like this?
Create a matrix using your:
company_table_dim (M)
calendar_Days_Table (N)
so that you get a new table of M x N rows.
Then go to Power Query, sort the data by date, and fill down your QTY column:
= Table.FillDown(#"Se expandió Fact_Table", {"QTY"})
That way the last known QTY is filled down to the end of Time_Table for any company filter.
Cons: the new M x N matrix could be millions of rows to calculate.
Greetings
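To illustrate the fill-down logic outside Power BI, here is a minimal pandas sketch of the same idea (the column names are just for this example):
import pandas as pd

# Example data: Company 1 has counts through day 10, Company 2 through day 6
df = pd.DataFrame({
    'Company': ['Company 1'] * 11 + ['Company 2'] * 7,
    'Day': list(range(1, 12)) + list(range(1, 8)),
    'Count': [10, 20, 21, 30, 40, 50, 55, 60, 62, 70, None,
              15, 25, 30, 34, 45, 100, None],
})

# The Table.FillDown step: carry each company's last known Count forward
df['Count'] = df.groupby('Company')['Count'].ffill()

# Averaging across the selected companies per day now matches the expected output,
# e.g. day 7 -> AVG(55, 100) = 77.5
print(df.groupby('Day')['Count'].mean())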

How to create a boolean calculated field in Amazon QuickSight?

Let's assume I have access to this data in QuickSight:
Id Amount Date
1 10 15-01-2019
2 0 16-01-2020
3 100 21-12-2019
4 34 15-01-2020
5 5 20-02-2020
1 50 13-09-2020
4 0 01-01-2020
I would like to create a boolean calculated field, named "Amount_in_2020", whose value is True when the Id has a strictly positive total Amount in 2020, and False otherwise.
With Python I would have done the following:
import pandas as pd

# Sample data
df = pd.DataFrame({'Id': [1, 2, 3, 4, 5, 1, 4],
                   'Amount': [10, 0, 100, 34, 5, 50, 0],
                   'Date': ['15-01-2019', '16-01-2020', '21-12-2019',
                            '15-01-2020', '20-02-2020', '13-09-2020', '01-01-2020']})
df['Date'] = pd.to_datetime(df['Date'], format='%d-%m-%Y')
# Group by Id to get each Id's total Amount within 2020
df_gb = df[(df['Date'] >= '2020-01-01') & (df['Date'] <= '2020-12-31')].groupby('Id')['Amount'].sum()
# Creation of the wanted column: True where the Id's 2020 total is strictly positive
df['Amount_in_2020'] = df['Id'].isin(df_gb[df_gb > 0].index)
But I can't find a way to create such a calculated field in QuickSight. Could you please help me?
Expected output :
Id Amount Date Amount_in_2020
1 10 2019-01-15 True
2 0 2020-01-16 False
3 100 2019-12-21 False
4 34 2020-01-15 True
5 5 2020-02-20 True
1 50 2020-09-13 True
4 0 2020-01-01 True
Finally found:
ifelse(sumOver(max(ifelse(extract("YYYY",{Date})=2020,{Amount},0)), [{Id}])>0,1,0)
The inner ifelse keeps the Amount only for rows dated in 2020, sumOver(..., [{Id}]) totals it per Id, and the outer ifelse returns 1 when that total is strictly positive, else 0.

How to generate indicator if value of variable is observed in two different periods in Stata

I have a dataset containing various drugs and the dates they were supplied. I would like to create an indicator variable DIBP that takes a value of 1 if the same drug was supplied during both period 1 and period 2 of a given year, and zero otherwise. Period 1 is 1 April to 30 June, and period 2 is 1 October to 31 December.
I have written the following code:
. input id month day year str10 drug
id month day year drug
1. 1 5 1 2003 aspirin
2. 1 11 1 2003 aspirin
3. 1 6 1 2004 aspirin
4. 1 5 1 2005 aspirin
5. 1 11 1 2005 aspirin
6. end
.
. gen date = mdy(month,day,year)
. format date %d
.
. gen period = 1 if inlist(month,4,5,6)
(2 missing values generated)
. replace period = 2 if inlist(month,10,11,12)
(2 real changes made)
.
. label define plab 1"1 April to 30 June" 2"1 October to 31 December"
. label value period plab
.
. * Generate indicator
. gen DIBP = 0
. label var DIBP "Drug In Both Periods"
.
. bysort id year: replace DIBP = 1 if drug[period==1] == "aspirin" & drug[period==2] == "aspirin"
(0 real changes made)
.
. list
+---------------------------------------------------------------------------------+
| id month day year drug date period DIBP |
|---------------------------------------------------------------------------------|
1. | 1 5 1 2003 aspirin 01may2003 1 April to 30 June 0 |
2. | 1 11 1 2003 aspirin 01nov2003 1 October to 31 December 0 |
3. | 1 6 1 2004 aspirin 01jun2004 1 April to 30 June 0 |
4. | 1 5 1 2005 aspirin 01may2005 1 April to 30 June 0 |
5. | 1 11 1 2005 aspirin 01nov2005 1 October to 31 December 0 |
+---------------------------------------------------------------------------------+
I would expect DIBP to take a value of 1 for observations 1, 2, 4 and 5 (because they took aspirin during both periods for years 2003 and 2005) and a value of zero for observation 3 (because aspirin was only taken during one period in 2004), but this isn't the case. Where am I going wrong? Thank you.
There is a problem apparent with your use of subscripts. You seem to be assuming that a subscript can be used to select other observations, which can indeed be done individually. But what you tried is legal yet not what you want.
The expressions used as subscripts
period == 1
period == 2
will be evaluated as true (1) or false (0) according to the value of period in the current observation. Then either observation 0 (which is always regarded as having missing values) or observation 1 (the first in each group of observations) will be used. Otherwise put, subscripts evaluate as observation numbers, not as defining subsets of the data.
There is a further puzzle because even for the same person and year, period 1 or period 2 could in principle cover several observations. In the example given the drug is constant anyway, but what would you expect the code to do if the drug differed? The crux most evident to me is distinguishing between a flag for any prescriptions of a certain drug and one for all prescriptions being of that drug in a period. More at this FAQ.
Otherwise this code may help. Extension to several drugs is left as an exercise.
clear
input id month day year str10 drug
1 5 1 2003 aspirin
1 11 1 2003 aspirin
1 6 1 2004 aspirin
1 5 1 2005 aspirin
1 11 1 2005 aspirin
end
generate date = mdy(month,day,year)
format date %td
* code needs modification if any month is 1, 2, 3, 7, 8, 9
generate period = 1 if inlist(month,4,5,6)
replace period = 2 if inlist(month,10,11,12)
label define plab 1"1 April to 30 June" 2"1 October to 31 December"
label value period plab
bysort id year period (date): egen all_aspirin = min(drug == "aspirin")
by id year period: egen any_aspirin = max(drug == "aspirin")
by id year : gen both_all_aspirin = period[1] == 1 & period[_N] == 2 & all_aspirin[1] & all_aspirin[_N]
by id year : gen both_any_aspirin = period[1] == 1 & period[_N] == 2 & any_aspirin[1] & any_aspirin[_N]
list id date drug *aspirin
+----------------------------------------------------------------------+
| id date drug all_as~n any_as~n b~ll_a~n b~ny_a~n |
|----------------------------------------------------------------------|
1. | 1 01may2003 aspirin 1 1 1 1 |
2. | 1 01nov2003 aspirin 1 1 1 1 |
3. | 1 01jun2004 aspirin 1 1 0 0 |
4. | 1 01may2005 aspirin 1 1 1 1 |
5. | 1 01nov2005 aspirin 1 1 1 1 |
+----------------------------------------------------------------------+
As a style note, consider this example
generate dummy = 0
replace dummy = 1 if frog == 42
Experienced Stata programmers generally just write
generate dummy = frog == 42
See also this FAQ

Date periods based on first occurrence

I have a pandas data frame of orders:
OrderID OrderDate Value CustomerID
1 2017-11-01 12.56 23
2 2017-11-06 1.56 23
3 2017-11-08 2.67 23
4 2017-11-12 5.67 99
5 2017-11-13 7.88 23
6 2017-11-19 3.78 99
Let's look at the customer with ID 23.
His first order in the history was 2017-11-01. This date is the start date of his first week. It means that all his orders between 2017-11-01 and 2017-11-07 are assigned to his week number 1 (it IS NOT a calendar week like Monday to Sunday).
For the customer with ID 99, the first week starts on 2017-11-12, of course, as that is the date of his first order (OrderID 4).
I need to assign every order in the table to the respective index of the common table Periods. Periods[0] will contain orders from customers' week number 1, Periods[1] from customers' week number 2, etc.
OrderID 1 and OrderID 4 will be in the same index of the Periods table, as both orders were created in the first week of their respective customers.
The Periods table containing order IDs has to look like this:
Periods=[[1,2,4],[3,5,6]]
Is this what you want?
# Week number within each customer, counted from that customer's first order (OrderDate must be datetime)
df['New'] = df.groupby('CustomerID').OrderDate.apply(lambda x: (x - x.iloc[0]).dt.days // 7)
df.groupby('New').OrderID.apply(list)
Out[1079]:
New
0 [1, 2, 4]
1 [3, 5, 6]
Name: OrderID, dtype: object
To get your Periods table
df.groupby('New').OrderID.apply(list).tolist()
Out[1080]: [[1, 2, 4], [3, 5, 6]]
More info
df
Out[1081]:
OrderID OrderDate Value CustomerID New
0 1 2017-11-01 12.56 23 0
1 2 2017-11-06 1.56 23 0
2 3 2017-11-08 2.67 23 1
3 4 2017-11-12 5.67 99 0
4 5 2017-11-13 7.88 23 1
5 6 2017-11-19 3.78 99 1
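For reference, a minimal self-contained version of the same approach (a sketch using transform instead of apply, and assuming OrderDate is parsed as datetime):
import pandas as pd

df = pd.DataFrame({'OrderID': [1, 2, 3, 4, 5, 6],
                   'OrderDate': pd.to_datetime(['2017-11-01', '2017-11-06', '2017-11-08',
                                                '2017-11-12', '2017-11-13', '2017-11-19']),
                   'Value': [12.56, 1.56, 2.67, 5.67, 7.88, 3.78],
                   'CustomerID': [23, 23, 23, 99, 23, 99]})

# Week number within each customer, counted from that customer's first order
df['New'] = df.groupby('CustomerID')['OrderDate'].transform(lambda x: (x - x.min()).dt.days // 7)

# Collect order IDs per common week index
Periods = df.groupby('New')['OrderID'].apply(list).tolist()
print(Periods)  # [[1, 2, 4], [3, 5, 6]]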

Converting daily data into weekly in Pandas

I have a dataframe as given below:
Index Date Country Occurence
0 2013-12-30 US 1
1 2013-12-30 India 3
2 2014-01-10 US 1
3 2014-01-15 India 1
4 2014-02-05 UK 5
I want to convert the daily data into weekly data, grouped by Country, with sum as the aggregation method.
I tried resampling, but the output gave a MultiIndex data frame from which I was not able to access the "Country" and "Date" columns (please refer above).
The desired output is given below:
Date Country Occurence
Week1 India 4
Week2
Week1 US 2
Week2
Week5 UK 5
You can group by country and resample by week:
In [63]: df
Out[63]:
Date Country Occurence
0 2013-12-30 US 1
1 2013-12-30 India 3
2 2014-01-10 US 1
3 2014-01-15 India 1
4 2014-02-05 UK 5
In [64]: df.set_index('Date').groupby('Country').resample('W', how='sum')
Out[64]:
Occurence
Country Date
India 2014-01-05 3
2014-01-12 NaN
2014-01-19 1
UK 2014-02-09 5
US 2014-01-05 1
2014-01-12 1
And, you could use reset_index()
In [65]: df.set_index('Date').groupby('Country').resample('W', how='sum').reset_index()
Out[65]:
Country Date Occurence
0 India 2014-01-05 3
1 India 2014-01-12 NaN
2 India 2014-01-19 1
3 UK 2014-02-09 5
4 US 2014-01-05 1
5 US 2014-01-12 1
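Note that the how= argument of resample has since been removed in pandas. A sketch of the modern equivalent (min_count=1 keeps the empty weeks as NaN instead of 0; it assumes Date is already a datetime column):
df.set_index('Date').groupby('Country')['Occurence'].resample('W').sum(min_count=1).reset_index()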