Macro Do loop not working properly in RSUBMIT - sas

So I have this code that works well for one year, but I need to convert it to a loop so it works for the years 1970 to 2015.
Here is the code for 1 year that I specify in a %let statement.
%let year=1970;
rsubmit;
data home.historical_returns_&year;
set home.crspdata;
where (year <= &year - 1) and (year >= &year - 5);
returns_count + 1;
by id year;
if first.id or missing(tot_ret) then returns_count = 1;
run;
endrsubmit;
So far, that code works great for me. Now I am trying to use a loop so that it runs for the years 1970 to 2015.
I came up with this. It looks like it should work, but the year stays at 1970.
%macro GMV;
rsubmit;
%do year=1970 %to 2015;
data home.historical_returns_&year;
set home.crspdata;
where (year <= &year - 1) and (year >= &year - 5);
returns_count + 1;
by id year;
if first.id or missing(tot_ret) then returns_count = 1;
run;
%end;
endrsubmit;
%mend GMV;
%GMV
In the log, I see that the &year in the name never actually changes from 1970 to 1971 to 1972 and so on. So I do not end up with the 46 datasets (one per year from 1970 to 2015) that I need.
Has anybody had this problem?
Thank you!

You're mixing up remote processing with local processing in a way that's going to cause problems like this. Your macro variable won't be updated (and I'm a bit surprised it's not throwing an error about the %do loop, personally).
rsubmit;
%macro GMV;
%do year=1970 %to 2015;
data home.historical_returns_&year;
set home.crspdata;
where (year <= &year - 1) and (year >= &year - 5);
returns_count + 1;
by id year;
if first.id or missing(tot_ret) then returns_count = 1;
run;
%end;
%mend GMV;
%GMV
endrsubmit;
Put the whole macro in the rsubmit to get the result you're looking for - or put the whole rsubmit in the macro (not as good of an idea in my opinion, though Tom in comments notes that it might be the safer option in some cases).

If you want to reference a macro variable in the code that you RSUBMIT then the macro variable needs to exist in the remote session.
%macro GMV(start,end);
%local year;
%do year=&start %to &end;
%syslput year=&year;
rsubmit;
data home.historical_returns_&year;
set home.crspdata;
by id year;
where (year <= &year - 1) and (year >= &year - 5);
returns_count + 1;
if first.id or missing(tot_ret) then returns_count = 1;
run;
endrsubmit;
%end;
%mend GMV;
%GMV(1970,2015);
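To check that a value really reaches the remote session, a minimal sketch along these lines may help. It assumes a SAS/CONNECT session is already signed on, and the value 1975 is only an illustration. In open code the text between RSUBMIT and ENDRSUBMIT is sent to the remote session as-is, so the %PUT below resolves &year on the remote side.

```sas
/* Assumption: a SAS/CONNECT session is already signed on. */
%let year = 1975;        /* exists only in the local session so far   */
%syslput year = &year;   /* create or overwrite YEAR in the remote session */

rsubmit;
  /* Resolved by the remote macro processor, not the local one */
  %put NOTE: remote session sees year=&year;
endrsubmit;
```

If the %SYSLPUT line is left out, the remote log should instead show a warning that the macro variable reference cannot be resolved.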

Related

Iterate date in loop in SAS

I need help with one query. I have to iterate over a date in a %do loop, where the date is in yymmd6. format (e.g. 202112), so that once the month reaches 12 it automatically rolls over to the first month of the next year.
///// code////////
%let startmo=202010 ;
%let endmo= 202102;
%macro test;
%do month= &startmo %to &endmo;
Data ABC_&month;
Set test&month;
X=&month ;
%end;
Run;
%mend;
%test;
//////////
Output should be 5 datasets:
ABC_202010
ABC_202011
ABC_202012
ABC_202101
ABC_202102
I need the macro variable month to resolve to 202101 once it has passed 202012.
Those are not actual DATE values. Just strings that you have imposed your own interpretation on so that they LOOK like dates to you.
Use date values instead, and then it is easy to generate strings in the style you need by using a FORMAT.
%macro test(startmo,endmo);
%local offset month month_string;
%do offset = 0 %to %sysfunc(intck(month,&startmo,&endmo));
%let month=%sysfunc(intnx(month,&startmo,&offset));
%let month_string=%sysfunc(putn(&month,yymmn6.));
data ABC_&month_string;
set test&month_string;
X=&month ;
format X monyy7.;
run;
%end;
%mend;
%test(startmo='01OCT2020'd , endmo='01FEB2021'd)
And if you need to convert one of those strings into a date value use an INFORMAT.
%let date=%sysfunc(inputn(202010,yymmn6.));
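Putting the informat and the format together, here is a minimal sketch that accepts the original yyyymm strings and only writes the generated month strings to the log. The macro name list_months and its parameter names are made up for this illustration.

```sas
%macro list_months(startmo, endmo);
  %local start stop offset month;
  /* Convert the yyyymm strings into real SAS date values */
  %let start = %sysfunc(inputn(&startmo, yymmn6.));
  %let stop  = %sysfunc(inputn(&endmo, yymmn6.));
  /* Step month by month; INTNX handles the year rollover */
  %do offset = 0 %to %sysfunc(intck(month, &start, &stop));
    %let month = %sysfunc(intnx(month, &start, &offset));
    %put %sysfunc(putn(&month, yymmn6.));
  %end;
%mend list_months;

%list_months(202010, 202102)
```

For these inputs the log should list 202010, 202011, 202012, 202101 and 202102, which are the five suffixes the question asks for.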
I would prefer to use a %do %while loop.
Check whether the last 2 characters are 12; if so, increment the year and reset the month part to 01.
code
%let startmo=202010 ;
%let endmo= 202102;
%macro test;
%local mon yr;
%do %while(&startmo <= &endmo);
data ABC_&startmo;
set test&startmo;
X=&startmo ;
run;
/* Advance to the next month inside the loop, otherwise it never ends. */
/* Note that this changes the global macro variable startmo. */
%let mon = %substr(&startmo, 5, 2);
%let yr = %substr(&startmo, 1, 4);
%if &mon = 12 %then %let startmo = %eval(&yr + 1)01;
%else %let startmo = %eval(&startmo + 1);
%end;
%mend;
%test;

SAS - defining variable as column sum and bootstrapping

I have a chunk to bootstrap a dataset.
%let iter = 2;
%let seed = 777;
data work.seg;
input segment $3. prem loss;
datalines;
AAA 5000 0
AAA 3000 12584
AAA 200 245
AAA 500 678
;
data work.test;
do i=1 to &iter;
sumprem=0;
do _n_=1 to 1000000 until (sumprem>=8700);
row_i=int(ranuni(&seed)*n)+1;
set work.seg point=row_i nobs=n;
sumprem + prem;
output;
end;
end;
stop;
run;
It works, but I have a few questions.
How can I make the 8700 threshold dynamic? I want (sumprem >= 8700) to be (sumprem >= &threshold), where &threshold is the sum of the prem column.
Is it correct how I am passing the &seed? Or should (&seed) be replaced with something like (&seed + _n_)?
How can I make the last data step into a macro... something like below, but I haven't gotten anything to work.
%macro boot(data, iter, seed);
%do i=1 %to &iter;
sumprem=0;
%do _n_=1 %to 1000000 %until (sumprem>=8700);
row_i=int(ranuni(&seed)*n)+1;
set work.seg point=row_i nobs=n;
sumprem + prem;
output;
%end;
%end;
%mend;
I assume you want to calculate the sum of prem from work.seg?
proc sql noprint ;
select sum(prem) format=best32. into :threshold trimmed from seg ;
quit;
Your macro code is confusing macro logic, which can be used to generate code, and data step logic which is what operates on the actual data. To make it into a macro just use the same macro variable names for the names of the parameters to the macro and leave the code the same.
%macro boot(in,out, iter, seed);
data &out;
do until (eof);
set &in end=eof;
threshold + prem ;
end;
do i=1 to &iter;
sumprem=0;
do _n_=1 to 1000000 until (sumprem>=threshold);
row_i=int(ranuni(&seed)*n)+1;
set &in point=row_i nobs=n;
sumprem + prem;
output;
end;
end;
stop;
run;
%mend boot;
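As a usage sketch, assuming the %boot macro above has been compiled and work.seg exists as in the question, a call could look like the one below. On the seed question: as far as I know, the RANUNI function uses its seed argument only the first time it is called in a step, so a constant &seed should be enough and &seed + _n_ should make no difference.

```sas
/* Two bootstrap replicates of work.seg; each replicate keeps drawing rows
   until the resampled premium reaches the column total computed in the macro */
%boot(work.seg, work.test, 2, 777)

/* Quick check: rows drawn and resampled premium per replicate (variable i) */
proc means data=work.test n sum;
  class i;
  var prem;
run;
```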

Find three most recent data year for each row

I have a data set with one row for each country and 100 columns (10 variables with 10 data years each).
For each variable I am trying to make a new data set with the three most recent data years for that variable for each country (which might not be successive).
This is what I have so far, but I know it's wrong because of the nested loop: it gives the same value for recent_1, recent_2 and recent_3. However, I haven't figured out how to create recent_1, recent_2 and recent_3 without two loops.
%macro test();
data Maternal_care_recent;
set wb;
keep country MATERNAL_CARE_2004 -- MATERNAL_CARE_2013 recent_1 recent_2 recent_3;
%let rc = 1;
%do i = 2013 %to 2004 %by -1;
%do rc = 1 %to 3 %by 1;
%if MATERNAL_CARE_&i. ne . %then %do;
recent_&rc. = MATERNAL_CARE_&i.;
%end;
%end;
%end;
run;
%mend;
%test();
You don't need to use a macro to do this - just some arrays:
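data Maternal_care_recent;
set wb;
/* Index the array by year so that mc[2013] is MATERNAL_CARE_2013 */
array mc {2004:2013} MATERNAL_CARE_2004-MATERNAL_CARE_2013;
array recent {3} recent_1-recent_3;
rc = 0;
/* Walk back from the latest year; stop once three non-missing values are found */
do i = 2013 to 2004 by -1 while (rc < 3);
if mc[i] ne . then do;
rc = rc + 1;
recent[rc] = mc[i];
end;
end;
keep country MATERNAL_CARE_2004-MATERNAL_CARE_2013 recent_1-recent_3;
run;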
Maybe I don't get your request, but according to your description:
"For each variable I am trying to make a new data set with the three most recent data years for that variable for each country (which might not be successive)" I created this sample dataset with dt1 and dt2 and 2 locations.
The output will be 2 datasets (and generally the number of the variables starting with DT) named DS1 and DS2 with 3 observations for each country, the first one for the first variable, the second one for the second variable.
This is the sample dataset:
data sample_ds;
length city $10 dt1 dt2 8.;
infile datalines dlm=',';
input city $ dt1 dt2;
datalines;
MS,5,0
MS,3,9
MS,3,9
MS,2,0
MS,1,8
MS,1,7
CA,6,1
CA,6,.
CA,6,.
CA,2,8
CA,1,5
CA,0,4
;
This is the sample macro:
%macro help(ds=);
data vars(keep=dt:); set &ds; if _n_ not >0; run;
%let op = %sysfunc(open(vars));
%let nvrs = %sysfunc(attrn(&op,nvars));
%let cl = %sysfunc(close(&op));
%do idx=1 %to &nvrs.;
proc sort data=&ds(keep=city dt&idx.) out=ds&idx.(where=(dt&idx. ne .)) nodupkey; by city DESCENDING dt&idx.; run;
data ds&idx.; set ds&idx.;
retain cnt;
by city DESCENDING dt&idx.;
if first.city then cnt=0; else cnt=cnt+1;
run;
data ds&idx.(drop=cnt); set ds&idx.(where=(cnt<3)); rename dt&idx.=act&idx.; run;
%end;
%mend;
You will run this macro with:
%help(ds=sample_ds);
In the first statement of the macro I select the variables on which I want to iterate:
data vars(keep=dt:); set &ds; if _n_ not >0; run;
Work on this if you want to make this work for your code, or simply rename your variables as DT1 DT2...
Let me know if it is correct for you.
When writing macro code, always keep in mind what has to be done when. SAS processes your code stepwise.
Before your sas code is even compiled, your macro variables are resolved and your macro code is executed
Then the resulting SAS Base code is compiled
Finally the code is executed.
When you write %if MATERNAL_CARE_&i. ne . %then %do, this is macro code interpreted before compilation.
At that time MATERNAL_CARE_&i. is not a variable but a text string containing a macro variable.
The first time you run through your %do i = 2013 %to 2004 %by -1, it is filled in as MATERNAL_CARE_2013, the second time as MATERNAL_CARE_2012, etc.
Then the macro %if statement is interpreted, and as the text string MATERNAL_CARE_2013 is never equal to a dot, the condition is always TRUE, whatever the data contain,
so recent_&rc. = MATERNAL_CARE_&i. is passed to the compiler unconditionally for every year and every &rc., and the last assignment (2004) wins.
You can see that if you run your code with options mprint;
The solution:
options mprint;
%macro test();
data Maternal_care_recent;
set wb;
keep country MATERNAL_CARE_: recent_:;
** The : acts as a wild card here **;
%do i = 2013 %to 2004 %by -1;
if MATERNAL_CARE_&i. ne . then do;
%do rc = 1 %to 3 %by 1;
recent_&rc. = MATERNAL_CARE_&i.;
%end;
end;
%end;
run;
%mend;
%test();
Now, before compilation of if MATERNAL_CARE_&i. ne . then do, only the &i. is evaluated and if MATERNAL_CARE_2013 ne . then do is passed to the compiler.
The compiler will see this as a test of whether the SAS variable MATERNAL_CARE_2013 has a missing value, and that is just what you wanted.
Remark:
It is not essential that I moved the if statement above the %do rc loop. It is just more efficient because the condition is then evaluated less often.
It is however essential that you close your %ifs and %dos with an %end and your ifs and dos with an end;
Remark:
You do not need %let rc = 1, because %do rc = 1 %to 3 already initialises &rc.
For completeness: SAS is compiled stepwise.
The next PROC or data step and its macro code are only considered when the previous one has been executed.
That is why you can write macro variables from a data step or an SQL select into that will influence the code you compile in your next step,
something you cannot do, for instance, with C++ precompilation.
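A minimal sketch of that stepwise behaviour follows. The data set names and the macro variable cutoff are made up for this illustration. The first step runs to completion and stores a value, and only then is the second step compiled, with &cutoff already resolved.

```sas
/* Step 1: executes first and writes a macro variable */
data _null_;
  call symputx('cutoff', 3);
run;

/* Step 2: compiled only after step 1 has run, so &cutoff resolves to 3 */
data small;
  do i = 1 to 10;
    if i <= &cutoff then output;
  end;
run;

/* Write the resulting rows to the log */
data _null_;
  set small;
  put i=;
run;
```

The log from the last step should show i=1, i=2 and i=3.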
Thanks everyone. Found a hybrid solution from a few solutions posted.
data sample_ds;
infile datalines dlm=',';
input country $ maternal_2004 maternal_2005
maternal_2006 maternal_2007 maternal_2008 maternal_2009 maternal_2010 maternal_2011 maternal_2012 maternal_2013;
datalines;
MS,5,0,5,0,5,.,5,.,5,.
MW,3,9,5,0,5,0,5,.,5,0
WE,3,9,5,0,5,.,.,.,.,0
HU,2,0,5,.,5,.,5,0,5,0
MI,1,8,5,0,5,0,5,.,5,0
HJ,1,7,5,0,5,0,.,0,.,0
CJ,6,1,5,0,5,0,5,0,5,0
CN,6,1,.,5,0,5,0,5,0,5
CE,6,5,0,5,0,.,0,5,.,8
CT,2,5,0,5,0,5,0,5,0,9
CW,1,5,0,5,0,5,.,.,0,7
CH,0,5,0,5,0,.,0,.,0,5
;
%macro test(var);
data &var._recent;
set sample_ds;
keep country &var._1 &var._2 &var._3;
array mc {*} &var._2004-&var._2013;
array recent {*} &var._1-&var._25;
count=1;
do i = 10 to 1 by -1;
if mc[i] ne . then do;
recent[count] = mc[i];
count=count+1;
end;
end;
run;
%mend;

SAS: attempting to build a loop for uploading multiple files

I'm attempting to build a loop in SAS to upload several files, and am running into a few issues to work through. Current code:
%Macro Weatherupload(File=, output=);
proc import datafile = &File;
out = &output;
dbms=dlm replace;
delimiter= ",";
getnames=yes;
guessingrows = 1000;
run;
%Mend Weatherupload;
%Macro WeatherPrepare(input=, output=);
data &output (keep=Wban_Number _YearMonthDay DewPoint Temp _Avg_Dew_Pt _Avg_Temp year month day);
set &input;
DewPoint = Input(compress(_Avg_Dew_Pt,"*"), 3.);
Temp = Input(compress(_Avg_Temp,"*"), 3.);
year = (_yearmonthday - mod(_yearmonthday, 10000))/10000;
month = ((_yearmonthday - mod(_yearmonthday, 100)) - (_yearmonthday - mod(_yearmonthday,10000)))/100;
day = mod(_yearmonthday, 100);
drop _Avg_Dew_Pt _Avg_Temp _YearMonthDay;
run;
%Mend WeatherPrepare;
data temperatures;
do i = 1999 to 2015;
do j = 1 to 12;
name = 'C:\Users\DILLON.SAXE\Documents\'||i||j||'.tar'||' \'||i||j||'daily.txt';
output = i||j||'weather';
final = i||j||'final';
%Weatherupload(File=name, output=output)
%WeatherPrepare(input=output, output=final)
end;
end;
run;
The goal is to run through several files, in several folders, listed in month + day + rest of title, and (at the moment) upload two variables of data from them. Later I will want to add in merging the files, and doing some more data work, but for the moment it's the macro issues and uploading that are holding it up.
Is there a way to either use proc upload in a loop, or use another data step in the loop?
I get the error "more positional variables than (something)" (I forget exact error, but it lists positional variables). I've tried adding and removing commas in the macros, but have not been able to get rid of this error. Any ideas?
I don't think you can call macros like you have in your data step. I think you're intending to use Call Execute.
data temperatures;
do i = 1999 to 2015;
do j = 1 to 12;
name = 'C:\Users\DILLON.SAXE\Documents\'||i||j||'.tar'||' \'||i||j||'daily.txt';
output = i||j||'weather';
final = i||j||'final';
call execute('%Weatherupload(File='||name||', output='||output||')');
call execute('%WeatherPrepare(input='||output||', output='||final||')');
end;
end;
run;
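One thing to watch in the name-building lines: concatenating a numeric variable with || converts it with the BEST12. format, which pads the digits with leading blanks, so the generated names end up containing spaces. A minimal sketch of one way around that uses CATS, which strips blanks, and the Z2. format for a two-digit month. Putting the prefix first is an assumption made here so that the result is a valid data set name, since those cannot start with a digit.

```sas
data _null_;
  length output final $32;
  do i = 1999 to 2000;               /* shortened range, for illustration */
    do j = 1 to 12;
      /* CATS removes leading and trailing blanks from each piece */
      output = cats('weather', i, put(j, z2.));
      final  = cats('final', i, put(j, z2.));
      put output= final=;            /* e.g. output=weather199901 final=final199901 */
    end;
  end;
run;
```

The same pattern can be used to build the strings passed to call execute.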
Alternatively, assuming you're trying to read all files in a folder, I think you should create a list of file names in a data set and then use a data step with the filename option to input all the files at once. Here's a brief method for doing that if they were all in a single folder: https://communities.sas.com/docs/DOC-10426
Here is a page that has code to get a list of files into a data set
http://www.sascommunity.org/wiki/Making_Lists
Since your macros have neither conditionals (%if) nor loops (%do),
I suggest you use them as parameterized %includes.
Here is a tool to read the list-of-files data set and call a program
http://www.sascommunity.org/wiki/Call_Execute_Parameterized_Include
note: in proc import always set guessingrows to the max value;
in v9.3 that is 2147483647;
Got it sorted out, based on the first answer. Eventual code:
%Macro Weatherupload(File=, output=);
proc import datafile = "&File"
out = &output
dbms=dlm replace;
delimiter= ",";
getnames=yes;
guessingrows = 1000;
run;
%Mend Weatherupload;
%Macro WeatherPrepare(input=, output=);
data &output;
set &input;
DewPoint = Input(compress(_Avg_Dew_Pt,"*"), 3.);
Temp = Input(compress(_Avg_Temp,"*"), 3.);
year = (_yearmonthday - mod(_yearmonthday, 10000))/10000;
month = ((_yearmonthday - mod(_yearmonthday, 100)) - (_yearmonthday - mod(_yearmonthday,10000)))/100;
day = mod(_yearmonthday, 100);
keep Wban_Number DewPoint Temp year month day;
run;
%Mend WeatherPrepare;
%Macro WeatherPrepare2(input=, output=);
data &output;
set &input;
DewPoint = Input(DewPoint, 3.);
Temp = Input(compress(_Avg_Temp,"*"), 3.);
year = (_yearmonthday - mod(_yearmonthday, 10000))/10000;
month = ((_yearmonthday - mod(_yearmonthday, 100)) - (_yearmonthday - mod(_yearmonthday,10000)))/100;
day = mod(_yearmonthday, 100);
Wban_Number = Wban;
keep Wban_Number DewPoint Temp year month day;
run;
%Mend WeatherPrepare2;
%Macro Append(merge=);
data temperatures;
set temperatures &merge;
run;
%Mend Append;
data temperatures;
do i = 1999 to 2015;
do j = 1 to 12;
jzero = put(j, z2.);
name = compress('C:\Users\DILLON.SAXE\Documents\'||i||jzero||'.tar'||'\'||i||jzero||'daily.txt');
name2 = compress('C:\Users\DILLON.SAXE\Documents\'||'QCLCD'||i||jzero||'\'||i||jzero||'daily.txt');
output = compress('weather'||i||j);
final = compress('final'||i||j);
if 100*i+j < 200708 then
do;
call execute('%Weatherupload(File='||name||', output='||output||')');
call execute('%WeatherPrepare(input='||output||', output='||final||')');
end;
else
do;
call execute('%Weatherupload(File='||name2||', output='||output||')');
call execute('%WeatherPrepare2(input='||output||', output='||final||')');
end;
call execute('%Append(merge='||final||')');
end;
end;
drop i j jzero name name2 output final;
run;

LOOP in SAS with date

I have posted similar kind of question of loop earlier.
Here I have to loop over 2011 to 2022, but for year 2011 the calculation is different from years 2012 to 2022. From 2012 onwards, cost_2012 depends on cost_2011, cost_2013 depends on cost_2012, and so on. I tried this code but I am getting an error message.
%MACRO NFORE1;
proc sql;
create table cost_news_&time as
select *
,case (31DEC2011.d-EIS)/365<=10 then Segment_0_10
(31DEC2011.d-EIS)/365<=20 then Segment_10_20
else note end as cost_AGE_2011
,Latest_cost+('31DEC2011'd-Latest_cost_Date)/30.44*cost_AGE_2011 as cost_2011
%DO TIME=2012 %TO 2022;
%LET ltime=%eval(&time-1);
,case
("31DEC&time."d-EIS)/365<=10 then Segment_0_10
("31DEC&time."d-EIS)/365<=20 then Segment_10_20
else note end as cost_AGE_&time
,case
calculated cost_&ltime + calculated cost_AGE_&time * 12 as cost_&time
from cost_news
;
quit;
%END;
%MEND NFORE1;
%NFORE1;
data cost_news;
length EIS Latest_cost Latest_cost_Date Segment_0_10 Segment_10_20 Note 8;
run;
options mprint;
%MACRO NFORE1;
proc sql;
create table cost_news_ as
select *
,case when ("31DEC2011"d-EIS)/365<=10 then Segment_0_10
when ("31DEC2011"d-EIS)/365<=20 then Segment_10_20
else note end as cost_AGE_2011
,Latest_cost+('31DEC2011'd-Latest_cost_Date)/30.44* calculated cost_AGE_2011 as cost_2011
%DO TIME=2012 %TO 2022;
%LET ltime=%eval(&time-1);
,case
when ("31DEC&time"d-EIS)/365<=10 then Segment_0_10
when ("31DEC&time"d-EIS)/365<=20 then Segment_10_20
else note end as cost_AGE_&time
, calculated cost_&ltime + calculated cost_AGE_&time * 12 as cost_&time
%END;
from cost_news
;
quit;
%MEND NFORE1;
%NFORE1;
Are all your variables numeric? I ask because you use them that way, including NOTE.
My changes:
- The FROM clause was repeated by the %DO loop - you only need one, right? The same goes for the QUIT statement.
- Changed 31DEC2011.d to "31DEC2011"d.
- Added WHEN to the CASE expressions.
- Added the calculated keyword to cost_AGE_2011.
Anyway, data step programming would be much cleaner for this.
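As a rough sketch of what the data step version could look like, the code below mirrors the SQL above with two arrays indexed by year. The output name cost_news_fore, the array names age_rate and cost_yr, and the temporary variables yr and age are all made up for this illustration, and the sketch assumes that cost_news has no variables with those names and that all inputs are numeric.

```sas
data cost_news_fore;
  set cost_news;
  /* Arrays indexed by calendar year, so cost_yr[2012] is cost_2012 */
  array age_rate {2011:2022} cost_AGE_2011-cost_AGE_2022;
  array cost_yr  {2011:2022} cost_2011-cost_2022;
  do yr = 2011 to 2022;
    /* Age in years at 31 December of the year being processed */
    age = (mdy(12, 31, yr) - EIS) / 365;
    if age <= 10 then age_rate[yr] = Segment_0_10;
    else if age <= 20 then age_rate[yr] = Segment_10_20;
    else age_rate[yr] = Note;
    /* 2011 starts from the latest known cost; later years build on the year before */
    if yr = 2011 then
      cost_yr[yr] = Latest_cost + (mdy(12, 31, 2011) - Latest_cost_Date) / 30.44 * age_rate[yr];
    else
      cost_yr[yr] = cost_yr[yr - 1] + age_rate[yr] * 12;
  end;
  drop yr age;
run;
```

No macro loop is needed here, because the year-to-year dependency is handled by indexing the previous array element.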