SAS - rolling N months

I want to apply a factor based on YRMON (YYYY:MM), taken from a static list of factors that covers the last 12 months and rolls forward every month.
I have a static dataset that looks like this (see below).
I basically need to create a format that maps each entry to a YRMON in a rolling 13-month window, so that I can stamp the factor on each record (for calculations) based on its YRMON.
yrmon facto
DATE00 1.00000
DATE01 0.99944
DATE02 0.99907
DATE03 0.99907
DATE04 0.99889
DATE05 0.99799
DATE06 0.99659
DATE07 0.99500
DATE08 0.99296
DATE09 0.99100
DATE10 0.85000
DATE11 0.78000
DATE12 0.34900
Example Data before/after:
YRMON PAYAMT PAYAMT_AFTER
2020:10 $100.00 $34.90
2020:08 $100.00 $85.00
Here's what I have so far:
/*Create macro dates*/
%let to_date = %sysfunc(date());
%let paydt = %sysfunc(intnx(month,&to_date, -(0)));
%let payday = %eval(&paydt. -1);
%let chirodt12 = %sysfunc(intnx(month,&payday,-0), date9.);
%let chirodt11 = %sysfunc(intnx(month,&payday,-1), date9.);
%let chirodt10 = %sysfunc(intnx(month,&payday,-2), date9.);
%let chirodt9 = %sysfunc(intnx(month,&payday,-3), date9.);
%let chirodt8 = %sysfunc(intnx(month,&payday,-4), date9.);
%let chirodt7 = %sysfunc(intnx(month,&payday,-5), date9.);
%let chirodt6 = %sysfunc(intnx(month,&payday,-6), date9.);
%let chirodt5 = %sysfunc(intnx(month,&payday,-7), date9.);
%let chirodt4 = %sysfunc(intnx(month,&payday,-8), date9.);
%let chirodt3 = %sysfunc(intnx(month,&payday,-9), date9.);
%let chirodt2 = %sysfunc(intnx(month,&payday,-10), date9.);
%let chirodt1 = %sysfunc(intnx(month,&payday,-11), date9.);
%let chirodt0 = %sysfunc(intnx(month,&payday,-12), date9.);
%put chirodt12 &chirodt12.;
%put chirodt11 &chirodt11.;
%put chirodt10 &chirodt10.;
%put chirodt9 &chirodt9.;
%put chirodt8 &chirodt8.;
%put chirodt7 &chirodt7.;
%put chirodt6 &chirodt6.;
%put chirodt5 &chirodt5.;
%put chirodt4 &chirodt4.;
%put chirodt3 &chirodt3.;
%put chirodt2 &chirodt2.;
%put chirodt1 &chirodt1.;
%put chirodt0 &chirodt0.;
filename ibnr "/path/to/file/file.txt";
data ibnr;
infile ibnr dlm='09'X dsd missover firstobs = 2;
informat month $6. factor 7.5;
input month factor;
run;
data ibnr;
set ibnr;
if month = 'DATE12' then yrmon = put("&chirodt12"D, yymmc7.);
else if month = 'DATE11' then yrmon = put("&chirodt11"D, yymmc7.);
else if month = 'DATE10' then yrmon = put("&chirodt10"D, yymmc7.);
else if month = 'DATE09' then yrmon = put("&chirodt9"D, yymmc7.);
else if month = 'DATE08' then yrmon = put("&chirodt8"D, yymmc7.);
else if month = 'DATE07' then yrmon = put("&chirodt7"D, yymmc7.);
else if month = 'DATE06' then yrmon = put("&chirodt6"D, yymmc7.);
else if month = 'DATE05' then yrmon = put("&chirodt5"D, yymmc7.);
else if month = 'DATE04' then yrmon = put("&chirodt4"D, yymmc7.);
else if month = 'DATE03' then yrmon = put("&chirodt3"D, yymmc7.);
else if month = 'DATE02' then yrmon = put("&chirodt2"D, yymmc7.);
else if month = 'DATE01' then yrmon = put("&chirodt1"D, yymmc7.);
else if month = 'DATE00' then yrmon = put("&chirodt0"D, yymmc7.);
run;
/*format for later use to stamp on factor for calculations*/
data chrofact (keep = fmtname label start type hlo);
retain fmtname 'chfact' type 'C';
set ibnr end = lastrec;
start=yrmon;
label = factor;
output;
if lastrec then do;
start='XXX' ;
hlo = 'O';
label = '0';
output;
end;
run;
proc format cntlin = chrofact;
run;
This gives me what I want, but I'm sure there's a simpler method out there.

Converting a text value to a SAS date value is an INPUT operation (i.e. INFORMAT), not a PUT operation (i.e. FORMAT).
Instead consider creating a lookup table that maps a literal string "DATE<nn>" to a SAS date value corresponding to some predetermined base date.
Example
Decode the <nn> portion of the string literal and apply it in a call to INTNX. Use the output as the basis of a CNTLIN data set for an INFORMAT, or as a lookup table LEFT JOINED or HASH JOINED to some other data set containing the string literals "DATE<nn>".
data have;
length yrmon $6 facto 8.;
input yrmon facto;
basedate = today();
month = input(substr(yrmon,5),2.);
month_ago = 12-month;
ago_date = intnx('month', basedate, -month_ago);
format ago_date yymmdd10.;
datalines;
DATE00 1.00000
DATE01 0.99944
DATE02 0.99907
DATE03 0.99907
DATE04 0.99889
DATE05 0.99799
DATE06 0.99659
DATE07 0.99500
DATE08 0.99296
DATE09 0.99100
DATE10 0.85000
DATE11 0.78000
DATE12 0.34900
;
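As a rough sketch of the CNTLIN route mentioned above, the lookup table can be turned into a character format keyed on the YYYY:MM text and then applied to a payments table. The names cntlin, chfact, payments and payments_adj are assumptions made for illustration, and the sample payments are generated rather than taken from the question.

```sas
/* Sketch only: cntlin, chfact, payments and payments_adj are assumed names. */
/* Build a CNTLIN data set from HAVE: key = YYYY:MM text, label = factor.    */
data cntlin;
  set have end=lastrec;
  length fmtname $8 type $1 start $7 label $12 hlo $1;
  fmtname = 'chfact';
  type    = 'C';
  start   = put(ago_date, yymmc7.);
  label   = put(facto, 7.5);
  hlo     = ' ';
  output;
  if lastrec then do;   /* catch-all row for months outside the window */
    start = ' ';
    label = '0';
    hlo   = 'O';
    output;
  end;
  keep fmtname type start label hlo;
run;

proc format cntlin=cntlin;
run;

/* Hypothetical payments: one row for each of the three most recent months. */
data payments;
  length yrmon $7;
  do k = 0 to 2;
    yrmon  = put(intnx('month', today(), -k), yymmc7.);
    payamt = 100;
    output;
  end;
  drop k;
run;

data payments_adj;
  set payments;
  /* look up the factor text via the format, then convert it to a number */
  payamt_after = payamt * input(put(yrmon, $chfact.), best12.);
run;
```

Because the window is derived from today() each time the code runs, rebuilding the format monthly is what makes the factors roll forward.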

Related

How to find the previous 12Months date starting from last Month

I am trying to create a query to find number of occurrences in a list in a SAS dataset, for the past 12 Months starting from Last Month
I have created the macro below to be used in my WHERE clause:
%let cur_date = %sysfunc(today(), date9.);
%let pre_date2 = %sysfunc(putn(%sysfunc(intnx(month, %sysfunc(today()), -1, End)),%sysfunc(intnx(month, %sysfunc(today()), -12, End)) date9.)));
%put &pre_date4;
I would appreciate if you can help me with this.
Thanks
You need two macro variables: one for the end of the prior month and one for the first day of the 12-month window that ends with that month. Stepping back 11 months from last month gives 12 calendar months, which matches the output shown below.
%let last_month = %sysfunc(intnx(month, %sysfunc(today()), -1, E) );
%let last_12_months = %sysfunc(intnx(month, &last_month., -11, B) );
Now you can run your query using between:
where date BETWEEN &last_month. AND &last_12_months.;
Example:
data have;
do i = -36 to 0;
date = intnx('month', today(), i, 'B');
output;
end;
format date date9.;
drop i;
run;
data want;
set have;
where date BETWEEN &last_month. AND &last_12_months.;
run;
Output:
date
01OCT2020
01NOV2020
01DEC2020
01JAN2021
01FEB2021
01MAR2021
01APR2021
01MAY2021
01JUN2021
01JUL2021
01AUG2021
01SEP2021

how to vertically sum a range of dynamic variables in sas?

I have a dataset in SAS in which the months would be dynamically updated each month. I need to calculate the sum vertically each month and paste the sum below, as shown in the image.
Proc means/ proc summary and proc print are not doing the trick for me.
I was given the following code before:
%let month = month name;
%put &month.;
data new_totals;
set Final_&month. end=end;
&month._sum + &month._final;
/*feb_sum + &month._final;*/
output;
if end then do;
measure = 'Total';
&month._final = &month._sum;
/*Feb_final = feb_sum;*/
output;
end;
drop &month._sum;
run;
The problem is that this has all the months hardcoded, which I don't want. I am not too familiar with loops or arrays, so I need a solution for this, please.
It may be better to use a reporting procedure such as PRINT or REPORT to produce the desired output.
data have;
length group $20;
do group = 'A', 'B', 'C';
array month_totals jan2020 jan2019 feb2020 feb2019 mar2019 apr2019 may2019 jun2019 jul2019 aug2019 sep2019 oct2019 nov2019 dec2019;
do over month_totals;
/* random integer from 10 to 69 */
month_totals = 10 + floor(60 * rand('uniform'));
end;
output;
end;
run;
ods excel file='data_with_total_row.xlsx';
proc print noobs data=have;
var group ;
sum jan2020--dec2019;
run;
proc report data=have;
columns group jan2020--dec2019;
define group / width=20;
rbreak after / summarize;
compute after;
group = 'Total';
endcomp;
run;
ods excel close;
Data structure
The data sets you are working with are 'difficult' because the date aspect of the data is actually in the metadata, i.e. the column name. A better approach, in SAS, is to have categorical data with these columns:
group (categorical role)
month (categorical role)
total (continuous role)
Such data can be easily filtered with a where clause, and reporting procedures such as REPORT and TABULATE can use the month variable in a class statement.
Example:
data have;
length group $20;
do group = 'A', 'B', 'C';
do _n_ = 0 by 1 until (month >= '01feb2020'd);
month = intnx('month', '01jan2018'd, _n_);
total = 10 + floor(rand('uniform', 60));
output;
end;
end;
format month monyy5.;
run;
proc tabulate data=have;
class group month;
var total;
table
group all='Total'
,
month='' * total='' * sum=''*f=comma9.
;
where intck('month', month, '01feb2020'd) between 0 and 13;
run;
proc report data=have;
column group (month,total);
define group / group;
define month / '' across order=data ;
define total / '' ;
where intck('month', month, '01feb2020'd) between 0 and 13;
run;
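If the data already exist in the wide layout, one possible way to reach the long structure described above is PROC TRANSPOSE followed by a step that converts the old column names into real dates. This is only a sketch: the data set names wide and long, the helper variable month_name, and the three sample columns are assumptions for illustration.

```sas
/* Sketch only: WIDE, LONG and month_name are assumed names. */
data wide;
  input group $ jan2019 feb2019 mar2019;
  datalines;
A 12 30 41
B 25 18 33
;

/* One output row per group and month column; the value lands in COL1. */
proc transpose data=wide out=long(rename=(col1=total)) name=month_name;
  by group;                      /* WIDE must be sorted by group */
  var jan2019 feb2019 mar2019;
run;

/* Turn the former column name (e.g. jan2019) into a SAS date value. */
data long;
  set long;
  month = input(cats('01', month_name), date9.);
  format month monyy7.;
  drop month_name;
run;
```

With month stored as a date, the WHERE and CLASS techniques shown in the TABULATE and REPORT steps apply directly.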
Here is a basic way. Borrowed sample data from Richard.
data have;
length group $20;
do group = 'A', 'B';
array months jan2020 jan2019 feb2020 feb2019 mar2019 apr2019 may2019 jun2019 jul2019 aug2019 sep2019 oct2019 nov2019 dec2019;
do over months;
/* random integer from 10 to 69 */
months = 10 + floor(60 * rand('uniform'));
end;
output;
end;
run;
proc summary data=have;
var _numeric_;
output out=temp(drop=_:) sum=;
run;
data want;
set have temp (in=t);
if t then group='Total';
run;

Use a macro instead of 25 proc sql steps?

I have SAS code (SQL) that has to be repeated 25 times, once for each month/year combination (see code below). How can I use a macro in this code?
proc sql;
create table hh_oud_AUG_17 as
select hh_key
,sum(RG_count) as RG_count_aug_17
,case when sum(RG_count) >=2 then 1 else 0 end as loyabo_recht_aug_17
from basis_RG_oud
where valid_from_dt <= "01AUG2017"d <= valid_to_dt
group by hh_key
order by hh_key
;
quit;
proc sql;
create table hh_oud_SEP_17 as
select hh_key
,sum(RG_count) as RG_count_sep_17
,case when sum(RG_count) >=2 then 1 else 0 end as loyabo_recht_sep_17
from basis_RG_oud
where valid_from_dt <= "01SEP2017"d <= valid_to_dt
group by hh_key
order by hh_key
;
quit;
If you use a data step to do this, you can put all the desired columns in the same output dataset rather than using a macro to create 25 separate datasets:
/*Generate lists of variable names*/
data _null_;
stem1 = "RG_count_";
stem2 = "loyabo_recht_";
month = '01aug2017'd;
length suffix $4 vlist1 vlist2 $1000;
do i = 0 to 24;
suffix = put(intnx('month', month, i, 's'), yymmn4.);
vlist1 = catx(' ', vlist1, cats(stem1,suffix));
vlist2 = catx(' ', vlist2, cats(stem2,suffix));
end;
call symput("vlist1",vlist1);
call symput("vlist2",vlist2);
run;
%put vlist1 = &vlist1;
%put vlist2 = &vlist2;
/*Produce output table*/
data want;
if 0 then set basis_RG_oud;
start_month = '01aug2017'd;
/*Row 1 holds the RG_count_* sums, row 2 the loyabo_recht_* flags.*/
/*The array is not named rg_count because basis_RG_oud already has a variable RG_count.*/
array rg[2, 0:24] &vlist1 &vlist2;
do _n_ = 1 by 1 until(last.hh_key);
set basis_RG_oud;
by hh_key;
do i = 0 to hbound2(rg);
if valid_from_dt <= intnx('month', start_month, i, 's') <= valid_to_dt
then rg[1,i] = sum(rg[1,i], RG_count);
end;
end;
/*One output row per hh_key: flag the months whose total is at least 2.*/
do i = 0 to hbound2(rg);
rg[2,i] = rg[1,i] >= 2;
end;
keep hh_key &vlist1 &vlist2;
run;
Create a second data set that enumerates (is a list of) the months to be examined. Cross Join the original data to that second data set. Create a single output table (or view) that contains the month as a categorical variable and aggregates based on that. You will be able to by-group process, classify or subset based on the month variable.
data months;
do month = '01jan2017'd to '31dec2018'd;
output;
month = intnx ('month', month, 0, 'E');
end;
format month monyy7.;
run;
proc sql;
create table want as
select
month, hh_key,
sum(RG_count) as RG_count,
case when sum(RG_count) >=2 then 1 else 0 end as loyabo_recht
from
basis_RG_oud
cross join
months
where
valid_from_dt <= month <= valid_to_dt
group
by month, hh_key
order
by month, hh_key
;
…
/* Some analysis */
BY MONTH;
…
/* Some tabulation */
CLASS MONTH;
TABLE … MONTH …
WHERE year(month) = 2018;
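If separate tables per month are still wanted, as in the question, a macro %DO loop is one way to generate the 25 steps. The sketch below is an assumption-laden illustration, not either answerer's method: the macro name hh_by_month and its parameters start= and n= are hypothetical, and the suffix comes out as AUG17 rather than the question's AUG_17. It relies on the asker's table basis_RG_oud.

```sas
/* Sketch only: hh_by_month, start= and n= are hypothetical names. */
%macro hh_by_month(start=01AUG2017, n=25);
  %local i dt suffix;
  %do i = 0 %to %eval(&n - 1);
    /* first day of the i-th month after &start, as a numeric SAS date */
    %let dt = %sysfunc(intnx(month, "&start"d, &i, b));
    /* e.g. AUG17, used in the table and column names */
    %let suffix = %sysfunc(putn(&dt, monyy5.));

    proc sql;
      create table hh_oud_&suffix as
      select hh_key
            ,sum(RG_count) as RG_count_&suffix
            ,case when sum(RG_count) >= 2 then 1 else 0 end as loyabo_recht_&suffix
      from basis_RG_oud
      where valid_from_dt <= &dt <= valid_to_dt
      group by hh_key
      order by hh_key
      ;
    quit;
  %end;
%mend hh_by_month;

%hh_by_month(start=01AUG2017, n=25)
```

The single-table approaches above are usually easier to work with afterwards, since the month stays in the data instead of in 25 table names.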

SAS Fast forward a date until a limit using INTNX/INTCK

I'm looking to take an observation's date and keep rolling it forward by its specified repricing frequency until a target date is reached.
The dataset being used is:
data have;
input repricing_frequency date_of_last_repricing end_date;
datalines;
3 15399 21367
10 12265 21367
15 13879 21367
;
format date_of_last_repricing end_date date9.;
informat date_of_last_repricing end_date date9.;
run;
So the idea is that I'd keep applying the repricing frequency of 3, 10, or 15 months to date_of_last_repricing until it is as close as it can get to the date "31DEC2017". Thanks in advance.
EDIT including my recent workings:
data want;
set have;
repricing_N = intck('Month',date_of_last_repricing,'31DEC2017'd,'continuous');
dateoflastrepricing = intnx('Month',date_of_last_repricing,repricing_N,'E');
format dateoflastrepricing date9.;
informat dateoflastrepricing date9.;
run;
The INTNX function will compute an incremented date value, and allows the resultant interval alignment to be specified (in your case the 'end' of the month n-months hence)
data have;
format date_of_last_repricing end_date date9.;
informat date_of_last_repricing end_date date9.;
* use 12. to read the raw date values in the datalines;
input repricing_frequency date_of_last_repricing: 12. end_date: 12.;
datalines;
3 15399 21367
10 12265 21367
15 13879 21367
;
run;
data want;
set have;
status = 'Original';
output;
* increment and iterate;
date_of_last_repricing = intnx('month',
date_of_last_repricing, repricing_frequency, 'end'
);
do while (date_of_last_repricing <= end_date);
status = 'Computed';
output;
date_of_last_repricing = intnx('month',
date_of_last_repricing, repricing_frequency, 'end'
);
end;
run;
If you want to compute only the nearest end date, as when iterating by repricing frequency, you do not have to iterate. You can divide the months apart by the frequency to get the number of iterations that would have occurred.
data want2;
set have;
nearest_end_month = intnx('month', end_date, 0, 'end');
if nearest_end_month > end_date then nearest_end_month = intnx('month', nearest_end_month, -1, 'end');
months_apart = intck('month', date_of_last_repricing, nearest_end_month);
iterations_apart = floor(months_apart / repricing_frequency);
iteration_months = iterations_apart * repricing_frequency;
nearest_end_date = intnx('month', date_of_last_repricing, iteration_months, 'end');
format nearest: date9.;
run;
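/* Compare both approaches. The sample data has no id column, so repricing_frequency (unique per row here) serves as the key. */
proc sql;
select repricing_frequency, max(date_of_last_repricing) as nearest_end_date format=date9. from want group by repricing_frequency;
select repricing_frequency, nearest_end_date from want2;
quit;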

SAS: attempting to build a loop for uploading multiple files

I'm attempting to build a loop in SAS to upload several files, and am running into a few issues to work through. Current code:
%Macro Weatherupload(File=, output=);
proc import datafile = &File;
out = &output;
dbms=dlm replace;
delimiter= ",";
getnames=yes;
guessingrows = 1000;
run;
%Mend Weatherupload;
%Macro WeatherPrepare(input=, output=);
data &output (keep=Wban_Number _YearMonthDay DewPoint Temp _Avg_Dew_Pt _Avg_Temp year month day);
set &input;
DewPoint = Input(compress(_Avg_Dew_Pt,"*"), 3.);
Temp = Input(compress(_Avg_Temp,"*"), 3.);
year = (_yearmonthday - mod(_yearmonthday, 10000))/10000;
month = ((_yearmonthday - mod(_yearmonthday, 100)) - (_yearmonthday - mod(_yearmonthday,10000)))/100;
day = mod(_yearmonthday, 100);
drop _Avg_Dew_Pt _Avg_Temp _YearMonthDay;
run;
%Mend WeatherPrepare;
data temperatures;
do i = 1999 to 2015;
do j = 1 to 12;
name = 'C:\Users\DILLON.SAXE\Documents\'||i||j||'.tar'||' \'||i||j||'daily.txt';
output = i||j||'weather';
final = i||j||'final';
%Weatherupload(File=name, output=output)
%WeatherPrepare(input=output, output=final)
end;
end;
run;
The goal is to run through several files, in several folders, listed in month + day + rest of title, and (at the moment) upload two variables of data from them. Later I will want to add in merging the files, and doing some more data work, but for the moment it's the macro issues and uploading that are holding it up.
Is there a way to either use proc upload in a loop, or use another data step in the loop?
I get the error "more positional variables than (something)" (I forget exact error, but it lists positional variables). I've tried adding and removing commas in the macros, but have not been able to get rid of this error. Any ideas?
I don't think you can call macros from inside a data step the way you have. I think you're intending to use CALL EXECUTE.
data temperatures;
do i = 1999 to 2015;
do j = 1 to 12;
name = 'C:\Users\DILLON.SAXE\Documents\'||i||j||'.tar'||' \'||i||j||'daily.txt';
output = i||j||'weather';
final = i||j||'final';
call execute('%Weatherupload(File='||name||', output='||output||')');
call execute('%WeatherPrepare(input='||output||', output='||final||')');
end;
end;
run;
Alternatively, assuming you're trying to read all files in a folder, I think you should create a list of file names in a data set and use a data step with the filename option to read all the files at once instead. Here's a brief method for doing that when all the files are in a single folder: https://communities.sas.com/docs/DOC-10426
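A minimal sketch of that idea, assuming every file sits in one folder and shares one layout, uses a wildcard on the INFILE statement together with its FILENAME= option. The folder C:\weather, the data set name all_weather, and the three-field layout are assumptions for illustration and will not match the real files as-is.

```sas
/* Sketch only: the folder, all_weather and the field layout are assumptions. */
data all_weather;
  length current_file prev_file source_file $260;
  retain prev_file;
  /* the wildcard matches every *daily.txt file in the folder */
  infile "C:\weather\*daily.txt" dlm=',' dsd truncover filename=current_file;
  input @;                        /* load the next line and hold it */
  if current_file ne prev_file then do;
    prev_file = current_file;     /* first line of each file is its header */
    delete;
  end;
  /* assumed layout: only the first three comma-separated fields are read */
  input Wban_Number :$5. YearMonthDay :8. Avg_Temp :$5.;
  source_file = current_file;     /* FILENAME= variables are not written out */
  drop prev_file;
run;
```

Keeping source_file on every row makes it possible to trace a record back to the month file it came from.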
Here is a page that has code to get a list of files into a data set
http://www.sascommunity.org/wiki/Making_Lists
Since your macros have neither conditionals (%if) nor loops (%do),
I suggest you use them as parameterized %includes.
Here is a tool to read the list-of-files data set and call a program:
http://www.sascommunity.org/wiki/Call_Execute_Parameterized_Include
Note: in proc import, always set guessingrows to the max value;
in v9.3 that is 2147483647.
Got it sorted out, based on the first answer. Eventual code:
%Macro Weatherupload(File=, output=);
proc import datafile = "&File"
out = &output
dbms=dlm replace;
delimiter= ",";
getnames=yes;
guessingrows = 1000;
run;
%Mend Weatherupload;
%Macro WeatherPrepare(input=, output=);
data &output;
set &input;
DewPoint = Input(compress(_Avg_Dew_Pt,"*"), 3.);
Temp = Input(compress(_Avg_Temp,"*"), 3.);
year = (_yearmonthday - mod(_yearmonthday, 10000))/10000;
month = ((_yearmonthday - mod(_yearmonthday, 100)) - (_yearmonthday - mod(_yearmonthday,10000)))/100;
day = mod(_yearmonthday, 100);
keep Wban_Number DewPoint Temp year month day;
run;
%Mend WeatherPrepare;
%Macro WeatherPrepare2(input=, output=);
data &output;
set &input;
DewPoint = Input(DewPoint, 3.);
Temp = Input(compress(_Avg_Temp,"*"), 3.);
year = (_yearmonthday - mod(_yearmonthday, 10000))/10000;
month = ((_yearmonthday - mod(_yearmonthday, 100)) - (_yearmonthday - mod(_yearmonthday,10000)))/100;
day = mod(_yearmonthday, 100);
Wban_Number = Wban;
keep Wban_Number DewPoint Temp year month day;
run;
%Mend WeatherPrepare2;
%Macro Append(merge=);
data temperatures;
set temperatures &merge;
run;
%Mend Append;
data temperatures;
do i = 1999 to 2015;
do j = 1 to 12;
jzero = put(j, z2.);
name = compress('C:\Users\DILLON.SAXE\Documents\'||i||jzero||'.tar'||'\'||i||jzero||'daily.txt');
name2 = compress('C:\Users\DILLON.SAXE\Documents\'||'QCLCD'||i||jzero||'\'||i||jzero||'daily.txt');
output = compress('weather'||i||j);
final = compress('final'||i||j);
/* 100*i+j gives yyyymm as a number, e.g. 200707 */
if 100*i+j < 200708 then
do;
call execute('%Weatherupload(File='||name||', output='||output||')');
call execute('%WeatherPrepare(input='||output||', output='||final||')');
end;
else
do;
call execute('%Weatherupload(File='||name2||', output='||output||')');
call execute('%WeatherPrepare2(input='||output||', output='||final||')');
end;
call execute('%Append(merge='||final||')');
end;
end;
drop i j jzero name name2 output final;
run;