SAS - Cumulative sum with date range and conditions - sas

The following is an example of the data I have
startdate   enddate     amount
1/1/2010    2/2/2020    10
1/5/2011    2/3/2015    10
1/3/2012    2/2/2023    10
1/4/2013    2/2/2014    10
5/5/2015    2/2/2028    10
1/6/2016    2/2/2032    10
I want to calculate the sum of all existing amounts as of each start date so it should look like this:
startdate   amount
1/1/2010    10
1/5/2011    20
1/3/2012    30
1/4/2013    40
5/5/2015    30
1/6/2016    40
How do I do this in SAS?
Essentially what I want to do is for each of the start dates, calculate the cumulative sum of any amounts that haven't expired. So for the first four dates, it is just a running cumulative sum because none of the amounts have expired. But at 5/5/2015, two of the previous amounts have expired hence a cumulative sum of 30. Same for the last date, where the same two have previously expired and you have the additional amount as of 1/6/2016 therefore 40.

One way to accomplish this is with a self-join via Proc SQL:
proc sql;
    create table out_dset as
    /* sum the amounts of the matched (b) rows, not the anchor (a) row */
    select a.startdate, sum(b.amount) as amount
    from in_dset as a left join in_dset as b
        on a.startdate >= b.startdate and a.startdate < b.enddate
    group by a.startdate
    order by a.startdate;
quit;
For each observation in the original dataset, this code finds the observations in the same dataset that have already started and have not yet expired as of that start date, and sums their amounts.
You can change the second comparison operator from < to <= if you want to include situations when a previous amount expired on the same date as a given startdate.
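As a cross-check of the join condition outside SAS, here is a minimal Python sketch of the same logic applied to the sample rows from the question. It is only an illustration, not a replacement for the Proc SQL step; the names rows and active_totals are made up for this example, and the dates are assumed to be day/month/year (the totals come out the same if they are read as month/day/year).

```python
from datetime import date

# Sample rows from the question: (startdate, enddate, amount).
# Assumption: dates are read as day/month/year.
rows = [
    (date(2010, 1, 1), date(2020, 2, 2), 10),
    (date(2011, 5, 1), date(2015, 3, 2), 10),
    (date(2012, 3, 1), date(2023, 2, 2), 10),
    (date(2013, 4, 1), date(2014, 2, 2), 10),
    (date(2015, 5, 5), date(2028, 2, 2), 10),
    (date(2016, 6, 1), date(2032, 2, 2), 10),
]


def active_totals(rows):
    """For each start date, sum the amounts that have started and not yet expired."""
    totals = []
    for start, _, _ in sorted(rows):
        # Same condition as the join: b.startdate <= a.startdate < b.enddate
        total = sum(amount for s, e, amount in rows if s <= start < e)
        totals.append((start, total))
    return totals


totals = active_totals(rows)
for start, total in totals:
    print(start.isoformat(), total)
```

The totals come out as 10, 20, 30, 40, 30, 40, matching the desired output in the question.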

Related

Graph a daily amount in between two dates

The source table has a single amount together with a revenue start date and a revenue end date. I need to graph the amount over the period by day in Power BI.
For example:
Looking at the second row, the total amount is 730, but I need to calculate a daily rate and display it for each day of the revenue period. So if I had a bar chart for this row, it would show 34.76 for 16 April, 34.76 for 17 April, and so on until 6 May, which is the revenue end date. I've tried using between dates but can't seem to get it working.
You can use Power BI's CALENDAR() function to create a table of dates ranging from the minimum revenue start date to the maximum revenue end date.
Dates = CALENDAR(MIN(BookFees[Revenue Start Date]),MAX(BookFees[Revenue End Date]))
Then you can create a calculated column in the Dates table for the daily revenue.
Daily Revenue = Calculate(SUM(BookFees[RevenueDayAmount]),FILTER(BookFees,BookFees[Revenue Start Date]<=Dates[Date] && BookFees[Revenue End Date]>= Dates[Date]))
The result is a bar chart with one bar per day of the revenue period.
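The daily rate quoted in the question can be checked with a short Python sketch. This is only an illustration of the arithmetic (total divided by the number of days, counting both end dates), not DAX; the function name daily_revenue and the year 2020 are assumptions, since the question gives only the day and month.

```python
from datetime import date, timedelta


def daily_revenue(total, start, end):
    """Spread `total` evenly over every day from start to end, inclusive."""
    days = (end - start).days + 1
    rate = total / days
    return {start + timedelta(days=i): rate for i in range(days)}


# Figures from the question; the year 2020 is a hypothetical placeholder.
schedule = daily_revenue(730, date(2020, 4, 16), date(2020, 5, 6))
print(len(schedule))                          # number of days in the period
print(round(schedule[date(2020, 4, 16)], 2))  # daily rate for the first day
```

With 21 days between 16 April and 6 May inclusive, 730 / 21 rounds to the 34.76 per day mentioned in the question.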

DAX Calculate rolling sum

I have a problem calculating a measure that sums values for the 3 previous periods.
Below I attach a sample fact table and dict table to show the problem I am facing.
date        customer  segment  value
01.01.2021  1         A        10
02.01.2021  1         A        10
03.01.2021  1         A        10
04.01.2021  1         A        10
01.01.2021  2         B        20
02.01.2021  2         B        30
03.01.2021  2         B        40
dict table:
segment  segment_desc
A        Name of A
B        Name of B
Approach I have taken:
last 3 value =
VAR DATES = DATESINPERIOD(facts[date],LASTDATE(facts[date]), -3,MONTH)
RETURN CALCULATE([sum value], DATES)
It produces correct results as long as there is at least one record for April.
When I filter on segment_desc = 'B', it produces the result I attached: the value shown for April is 20, which is obviously not what I wanted. I would expect it to be 50.
Answer to the main question:
Time intelligence functions like DATESINPERIOD require a proper calendar table, because they expect continuous dates without gaps.
Answer to the follow-up question "why the measure shows value for January?"
It's a bit tricky. First, notice that LASTDATE returns blank in this filter context, because segment B has no dates in April.
So, your DAX measure then becomes this:
last 3 value =
VAR DATES = DATESINPERIOD(facts[date], BLANK(), -3,MONTH)
RETURN CALCULATE([sum value], DATES)
Subtracting 3 months from BLANK does not make sense, so DAX resolves this by replacing BLANK with the first (minimum) date in the table, which in this case is 1/1/2021. It then goes back 3 months from that date. As a result, the final measure is:
last 3 value =
CALCULATE([sum value], {2020-11-01, 2020-12-01, 2021-01-01 })
Since you have no data prior to 2021-01-01, the final result shows only January values.
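The fallback described above can be mimicked in a few lines of Python. This is a rough sketch of the explanation in this answer, not of the DAX engine itself; the names facts_b, months_back and last_3_value are made up for the illustration, and the fact rows are assumed to be one row per month.

```python
from datetime import date

# Segment B rows from the question, assumed to be one row per month.
facts_b = {date(2021, 1, 1): 20, date(2021, 2, 1): 30, date(2021, 3, 1): 40}


def months_back(anchor, n):
    """Return the first day of each of the n months ending at anchor's month."""
    index = anchor.year * 12 + (anchor.month - 1)
    return [
        date((index - k) // 12, (index - k) % 12 + 1, 1)
        for k in range(n - 1, -1, -1)
    ]


def last_3_value(facts, visible_dates):
    # Mimics the behaviour described above: with no visible dates, the
    # anchor falls back to the earliest date in the table.
    anchor = max(visible_dates) if visible_dates else min(facts)
    return sum(facts.get(m, 0) for m in months_back(anchor, 3))


april_dates = [d for d in facts_b if d.month == 4]  # empty: B has no April row
print(months_back(min(facts_b), 3))
print(last_3_value(facts_b, april_dates))
```

With no April dates visible, the window becomes November 2020 to January 2021, so only the January value of 20 is summed.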

DAX: How to count how many months have sales in a period

In my fact table (fTable) the columns I have are dates, region and sales.
dates   region   sales
-----   ------   -----
I am visualizing the data in a pivot table with regions as rows and months as columns (I have a date table (dDate) with a months column in my model)
I am looking for a way to dynamically change the denominator in an averaging measure if a certain region doesn't have sales in a given month. Right now my denominator is hard-coded as 6, because I am averaging 6 values in my numerator, but any one of them could be 0 if there are no sales in a certain month, in which case the denominator needs to change to 5, 4 or less depending on how many months have no sales. So I am looking to count how many of the past 6 months have sales and use that count as the denominator.
I have managed to count months with sales this way:
Denominator:=
var newTable = Summarize(fTable,fTable[date (month)], fTable[region],"Sales",[Sum of Sales])
var MonthsWithSales = Countrows(newTable)
RETURN
MonthsWithSales
I've tried to RETURN
Calculate(SUMX(newTable,MonthsWithSales), Dateadd(dDate[Date],-6,MONTH))
but it yields a wrong result.
Any suggestions?
Thanks
Based on my sample, we can use the VALUES and COUNTROWS functions inside CALCULATE to get what we need:
Measure = CALCULATE( COUNTROWS(VALUES('Table (2)'[Month])), ALL('Table (2)'[Month]) )
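To make the intended denominator concrete, here is a small Python sketch that counts the distinct months with sales inside a 6-month window. It is an illustration of the counting idea only, not DAX; the data, the function name months_with_sales, and the choice to treat a month whose sales total zero as a month without sales are all assumptions.

```python
from datetime import date

# Hypothetical sales rows: (date, region, sales).
sales = [
    (date(2021, 1, 15), "North", 100),
    (date(2021, 2, 10), "North", 0),
    (date(2021, 3, 5), "North", 250),
    (date(2021, 3, 20), "North", 50),
    (date(2021, 6, 1), "North", 75),
    (date(2021, 4, 1), "South", 40),
]


def months_with_sales(rows, region, end_year, end_month, window=6):
    """Count distinct months with non-zero sales in the window ending at end_month."""
    end = end_year * 12 + (end_month - 1)
    totals = {}
    for d, r, amount in rows:
        idx = d.year * 12 + (d.month - 1)
        # Keep only this region's rows that fall inside the trailing window.
        if r == region and end - window < idx <= end:
            totals[idx] = totals.get(idx, 0) + amount
    return sum(1 for total in totals.values() if total != 0)


denominator = months_with_sales(sales, "North", 2021, 6)
print(denominator)
```

For the hypothetical North region, January, March and June have sales in the window ending June 2021, so the denominator is 3 rather than 6.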

Computing moving average in SAS

I'm trying to use SAS to compute a moving average over x periods that uses forecasted values in the calculation. For example, suppose I have a data set with ten observations for a variable and want a 3-month moving average. The first forecast value should be the average of the last 3 observations, and the second forecast value should be the average of the last two observations and the first forecast value.
If you have for example data like this:
data input;
infile datalines;
length product $10 period value 8;
informat period yymmdd10.;
format period yymmdd10.;
input product $ period value;
datalines;
car 2016-01-01 10
car 2015-12-01 20
car 2015-11-01 30
car 2015-10-01 40
car 2015-09-01 30
car 2015-08-01 15
;
run;
You can left join the input table with itself using a condition like this:
input t1 left join input t2
on t1.product = t2.product
and t2.period between intnx('month',t1.period,-2,'b') and t1.period
group by t1.product, t1.period, t1.value
With this you have t1.value as the current value and avg(t2.value) as the 3-month average. To compute the 2-month average, use the ifn() function to set every value older than the previous period to missing:
avg(ifn( t2.period >= intnx('month',t1.period,-1,'b'),t2.value,. ))
The full code could look like this:
proc sql;
create table want as
select t1.product, t1.period, t1.value as currentValue,
ifn(count(t2.period)>1,avg(ifn( t2.period >= intnx('month',t1.period,-1,'b'),t2.value,. )),.) as twoMonthsAVG,
ifn(count(t2.period)>2,avg(t2.value),.) as threeMonthsAVG
from input t1 left join input t2
on t1.product = t2.product
and t2.period between intnx('month',t1.period,-2,'b') and t1.period
group by t1.product, t1.period, t1.value
;
quit;
I've also added a count(t2.period) condition to return missing values when there are not enough records to compute the measure.
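For readers who want to check the numbers without SAS, here is a minimal Python sketch of the same trailing averages on the sample data. It is an illustration under the assumption of one row per month with no gaps; the names values and trailing_avg are made up, and None stands in for a SAS missing value.

```python
# Monthly values for "car" from the sample data, keyed by (year, month).
values = {
    (2016, 1): 10, (2015, 12): 20, (2015, 11): 30,
    (2015, 10): 40, (2015, 9): 30, (2015, 8): 15,
}


def trailing_avg(values, year, month, n):
    """Average of the n months ending at (year, month); None if any month is absent."""
    end = year * 12 + (month - 1)
    window = [values.get(((end - k) // 12, (end - k) % 12 + 1)) for k in range(n)]
    if any(v is None for v in window):
        return None
    return sum(window) / n


for year, month in sorted(values, reverse=True):
    two = trailing_avg(values, year, month, 2)
    three = trailing_avg(values, year, month, 3)
    print(year, month, values[(year, month)], two, three)
```

For 2016-01 this gives a 2-month average of 15 and a 3-month average of 20; the oldest rows return None where the window reaches past the start of the data.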

SAS Calculate last 15 days of withdrawals

I have a dataset full of transactions and each observation has account number, date, and transaction amount variables. Obviously multiple transactions will have the same account number.
I want to calculate the total transaction amount for each account number over the last 15 days for each transaction.
So my final dataset set will be a set of transactions with the following variables: account number, date, transaction amount, and total transaction amount over past 15 days.
Any ideas?
Thanks!
You can do it in Proc SQL with a self-join. The where clause below is only there to keep the example small, so remove it for your own data.
This does actually do two passes of the data but it will be in one proc.
proc sql;
create table want as
select a.stock, a.date, a.open, sum(b.open) as total_open
from sashelp.stocks as a
left join sashelp.stocks as b
on a.date-b.date between 0 and 15
and a.stock=b.stock
where a.stock='IBM'
group by a.stock, a.date, a.open
order by a.stock, a.date;
quit;
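The same window condition (a date difference between 0 and 15 within one account) can be sketched in Python to show what the join produces. This is only an illustration with made-up transactions; the names transactions and with_trailing_total are hypothetical, and the quadratic loop mirrors the self-join rather than aiming for speed.

```python
from datetime import date

# Hypothetical transactions: (account, date, amount).
transactions = [
    ("A1", date(2020, 1, 1), 100),
    ("A1", date(2020, 1, 10), 50),
    ("A1", date(2020, 1, 20), 25),
    ("B2", date(2020, 1, 12), 70),
]


def with_trailing_total(rows, days=15):
    """Append to each row the account's total over its own date and the prior `days` days."""
    result = []
    for account, when, amount in rows:
        total = sum(
            amt
            for acct, d, amt in rows
            # Same account, and the other row is 0 to `days` days older.
            if acct == account and 0 <= (when - d).days <= days
        )
        result.append((account, when, amount, total))
    return result


for row in with_trailing_total(transactions):
    print(row)
```

Here the 20 January transaction for A1 totals 75, because the 1 January transaction is 19 days old and falls outside the window.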