Increment variable name - sas

Is it possible to increment different prefixed variable names in a simple way? For example, if my dataset has columns for Score1 all the way to Score20, I can simply do:
input Score1-Score20;
But what if I have Score1 Rank1 Total1 to Score20 Rank20 Total20, is there a way to increment these without manually typing out each one? So the result would look like:
Score1 Rank1 Total1 Score2 Rank2 Total2 Score3 Rank3 Total3 etc...

Do you care if the variables are created in a different order than in the input file? If not then use an ARRAY. Try this example.
data x ;
array x(3,20) a1-a20 b1-b20 c1-c20 ;
infile cards truncover;
do block=1 to 20;
do item=1 to 3;
input x(item,block) @;
end;
end;
put (_all_) (=);
list;
cards;
1 1 1 2 2 2 3 3 3 4 4 4 5 5 5
;
If you need them in that order then you need to use some type of code generation.
You could create a simple function style macro to emit the list of names.
%macro namelist(baselist,n);
%local i j;
%do i=1 %to &n ;
%do j=1 %to %sysfunc(countw(&baselist));
%scan(&baselist,&j)&i
%end;
%end;
%mend namelist;
...
input %namelist(Rank Total Score,20) ;
Or you could use a simple data step to build the list into a macro variable.
data _null_;
length i 8 basename $30 namelist $32000;
do i=1 to 20 ;
do basename='Rank ','Total','Score';
namelist=catx(' ',namelist,cats(basename,i));
end;
end;
call symputx('namelist',namelist);
run;
...
input &namelist ;

You could probably do a macro for this. I think this would work:
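%macro mymacro(runs);
%local i;
/* Emit a single INPUT statement that lists every name in order */
input
%do i=1 %to &runs;
Score&i Rank&i Total&i
%end;
;
%mend mymacro;
/* Call this inside the DATA step that reads the data */
%mymacro(20)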
Try here for better documentation: https://support.sas.com/documentation/cdl/en/mcrolref/61885/HTML/default/viewer.htm#a000543755.htm

The same question was asked here earlier in the week, and the answer is the same.
https://communities.sas.com/t5/General-SAS-Programming/Variables-listing/m-p/238350#M34601
You could use a data step to build the list of names in a macro variable.
data _null_;
length var $1000.;
do i=1 to 10;
var=catt(var, " Total"||put(i, 2. -l), " Male"||put(i, 2. -l), " Female"||put(i, 2. -l));
end;
call symputx('input_list', var);
run;
%put &input_list;
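To show how such a macro variable might then be used, here is a minimal sketch that builds a shorter list with CATX and CATS and passes it to an INPUT statement. The data set name counts and the data values are made up for illustration only.

```sas
/* Build the list Total1 Male1 Female1 ... Total3 Male3 Female3 */
data _null_;
  length var $1000;
  do i=1 to 3;
    /* CATX skips blank arguments, so the first pass adds no leading delimiter */
    var=catx(' ', var, cats('Total',i), cats('Male',i), cats('Female',i));
  end;
  call symputx('input_list', var);
run;

/* Use the generated list in an INPUT statement (illustrative values) */
data counts;
  input &input_list;
  datalines;
10 4 6 12 5 7 9 3 6
;
run;
```

CATX and CATS avoid the PUT with a left-alignment modifier, at the cost of relying on automatic numeric-to-character conversion of i.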

how can I build a loop for macro in SAS?

I want to do a simulation based on macro in SAS. I can build a function named 'fine()', the code is as follows
DATA CLASS;
INPUT NAME $ SEX $ AGE HEIGHT WEIGHT;
CARDS;
ALFRED M 14 69.0 112.5
ALICE F 13 56.5 84.0
BARBARA F 13 65.3 98.0
CAROL F 14 62.8 102.5
HENRY M 14 63.5 102.5
RUN;
PROC PRINT;
TITLE 'DATA';
RUN;
proc print data=CLASS;run;
PROC FCMP OUTLIB = work.functions.func;
function populationCalc(HEIGHT,WEIGHT,thres);
pop=HEIGHT-WEIGHT-thres;
return (pop);
ENDSUB;
options cmplib=(work.functions);
%macro fine(i);
data ex;
set CLASS;
thres=&i;
pop = populationCalc(HEIGHT,WEIGHT,thres);
if (pop>50) then score=1;
else score=0;
run;
proc iml;
USE ex;
READ all var _ALL_ into ma[colname=varNames];
CLOSE ex;
nn=nrow(ma);
total_score=sum(ma[,'thres']);
avg_score=sum(ma[,'thres'])/nn;
print total_score avg_score;
%mend fine;
%fine(10);
%fine(100);
%fine(150);
I want to build a loop around the macro 'fine()', also using a macro, but the result is not what I expect. How can I fix this?
%macro ct(n);
data data_want;
%do i=1 %to &n;
x=%fine(&i);
output x;
%end;
run;
%mend ct;
%ct(10);
%fine does not generate any text that can be used in the context of a right hand side (RHS) of a DATA Step variable assignment statement.
You seem to perhaps want this data set as a result of invoking %ct
i total_score average_score
- ----------- -------------
1 5 1
2 10 2
3 15 3
etc...
Step 1. Save IML result
Add this to the bottom of IML code in %fine, replacing the print
create fine_out var {total_score avg_score};
append;
close fine_out;
quit;
Step 2. Rewrite ct macro
Invoke %fine outside a DATA step context so the DATA and IML steps can run. Append the IML output to a results data set.
%macro ct(n,out=result);
%local i;
%do i=1 %to &n;
%fine(&i)
%if &i = 1 %then %do;
data &out; set fine_out; run;
%end;
%else %do;
proc append base=&out data=fine_out; run;
%end;
%end;
%mend;
options mprint;
%ct(10)
Running %ct(10) should produce the output data set WORK.RESULT from your data.

Do loop for creating new variables in SAS

I am trying to run this code
data swati;
input facility_id$ loan_desc : $50. sys_name :$50.;
cards;
fac_001 term_loan RM_platform
fac_001 business_loan IQ_platform
fac_002 business_loan BUSES_termloan
fac_002 business_loan RM_platform
fac_003 overdrafts RM_platform
fac_003 RCF IQ_platform
fac_003 term_loan BUSES_termloan
;
proc contents data=swati out=contents(keep=name varnum);
run;
proc sort data=contents;
by varnum;
run;
data contents;
set contents ;
where varnum in (2,3);
run;
data contents;
set contents;
summary=catx('_',name, 'summ');
run;
data _null_;
set contents;
call symput ("name" || put(_n_ , 10. -L), name);
call symput ("summ" || put (_n_ , 10. -L), summary);
run;
options mlogic symbolgen mprint;
%macro swati;
%do i = 1 %to 2;
proc sort data=swati;
by facility_id &&name&i.;
run;
data swati1;
set swati;
by facility_id &&name&i.;
length &&summ&i. $50.;
retain &&summ&i.;
if first.facility_id then do;
&&summ&i.="";
end;
if first.&&name&i. = last.&&name&i. then &&summ&i.=catx(',',&&name&i., &&summ&i.);
else if first.&&name&i. ne last.&&name&i. then &&summ&i.=&&name&i.;
run;
if last.facility_id ;
%end;
%mend;
%swati;
This code should create two new variables, loan_desc_summ and sys_name_summ, which hold all the loan_desc values and all the sys_name values for a facility on one line, separated by commas, for example (term_loan,business_loan) and (RM_platform,IQ_platform). If a facility has the same loan_desc twice, loan_desc_summ should contain that value only once.
The problem is that after running the do loop I get a dataset with only sys_name_summ and not loan_desc_summ. I want a dataset with all five variables: facility_id, loan_desc, sys_name, loan_desc_summ, sys_name_summ.
Could you please help me find out whether there is a problem in the do loop?
Your loop is always starting with the same input dataset (swati) and generating a new dataset (SWATI1). So only the last time through the loop has any effect. Each loop would need to start with the output of the previous run.
You also need to fix your logic for eliminating the duplicates.
For example you could change the macro to:
%macro swati;
data swati1;
set swati;
run;
%do i = 1 %to 2;
proc sort data=swati1;
by facility_id &&name&i.;
run;
data swati1;
set swati1;
by facility_id &&name&i ;
length &&summ&i $500 ;
retain &&summ&i ;
if first.facility_id then &&summ&i = ' ' ;
if first.&&name&i then &&summ&i = catx(',',&&summ&i,&&name&i);
if last.facility_id ;
run;
%end;
%mend;
Also your program could be a lot smaller if you just used arrays.
data want ;
set have ;
by facility_id ;
array one loan_desc sys_name ;
array two $500 loan_desc_summ sys_name_summ ;
retain loan_desc_summ sys_name_summ ;
do i=1 to dim(one);
if first.facility_id then two(i)=one(i) ;
else if not findw(two(i),one(i),',','t') then two(i)=catx(',',two(i),one(i));
end;
if last.facility_id;
drop i loan_desc sys_name ;
run;
If you want to make it more flexible you can put the list of variable names into a macro variable.
%let varlist=loan_desc sys_name;
You could then generate the list of new names easily.
%let varlist2=%sysfunc(tranwrd(&varlist,%str( ),_summ%str( )))_summ ;
Then you can use the macro variables in the ARRAY, RETAIN and DROP statements.
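Putting those pieces together, a minimal sketch might look like the following. It assumes the input is the swati data set from the question, already sorted by facility_id, and writes to a data set named want as in the array example above.

```sas
%let varlist=loan_desc sys_name;
/* Append _summ to every name: loan_desc_summ sys_name_summ */
%let varlist2=%sysfunc(tranwrd(&varlist,%str( ),_summ%str( )))_summ;

data want;
  set swati;                /* assumed to be sorted by facility_id */
  by facility_id;
  array one &varlist;       /* source variables (character, defined by SET) */
  array two $500 &varlist2; /* new summary variables */
  retain &varlist2;
  do i=1 to dim(one);
    if first.facility_id then two(i)=one(i);
    /* add a value only if it is not already in the comma-separated list */
    else if not findw(two(i),one(i),',','t') then two(i)=catx(',',two(i),one(i));
  end;
  if last.facility_id;
  drop i &varlist;
run;
```

Adding another variable to summarise then only requires changing the first %LET statement.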

Find three most recent data year for each row

I have a data set with one row for each country and 100 columns (10 variables with 10 data years each).
For each variable I am trying to make a new data set with the three most recent data years for that variable for each country (which might not be successive).
This is what I have so far, but I know it's wrong because of the nested loop: it gives the same value for recent1, recent2 and recent3. However, I haven't figured out how to create recent1, recent2 and recent3 without two loops.
%macro test();
data Maternal_care_recent;
set wb;
keep country MATERNAL_CARE_2004 -- MATERNAL_CARE_2013 recent_1 recent_2 recent_3;
%let rc = 1;
%do i = 2013 %to 2004 %by -1;
%do rc = 1 %to 3 %by 1;
%if MATERNAL_CARE_&i. ne . %then %do;
recent_&rc. = MATERNAL_CARE_&i.;
%end;
%end;
%end;
run;
%mend;
%test();
You don't need to use a macro to do this - just some arrays:
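data Maternal_care_recent;
set wb;
keep country MATERNAL_CARE_2004-MATERNAL_CARE_2013 recent_1 recent_2 recent_3;
/* index the array by year so that mc[2013] is MATERNAL_CARE_2013 */
array mc {2004:2013} MATERNAL_CARE_2004-MATERNAL_CARE_2013;
array recent {3} recent_1-recent_3;
rc = 0;
/* walk back from the latest year and stop once three values are found */
do i = 2013 to 2004 by -1 while (rc < 3);
if mc[i] ne . then do;
rc = rc + 1;
recent[rc] = mc[i];
end;
end;
run;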
Maybe I don't get your request, but according to your description:
"For each variable I am trying to make a new data set with the three most recent data years for that variable for each country (which might not be successive)" I created this sample dataset with dt1 and dt2 and 2 locations.
The output will be one dataset per variable whose name starts with DT (two here, named DS1 and DS2), each with up to 3 observations per country (city in the sample): DS1 for the first variable and DS2 for the second.
This is the sample dataset:
data sample_ds;
length city $10 dt1 dt2 8.;
infile datalines dlm=',';
input city $ dt1 dt2;
datalines;
MS,5,0
MS,3,9
MS,3,9
MS,2,0
MS,1,8
MS,1,7
CA,6,1
CA,6,.
CA,6,.
CA,2,8
CA,1,5
CA,0,4
;
This is the sample macro:
%macro help(ds=);
data vars(keep=dt:); set &ds; if _n_ not >0; run;
%let op = %sysfunc(open(vars));
%let nvrs = %sysfunc(attrn(&op,nvars));
%let cl = %sysfunc(close(&op));
%do idx=1 %to &nvrs.;
proc sort data=&ds(keep=city dt&idx.) out=ds&idx.(where=(dt&idx. ne .)) nodupkey; by city DESCENDING dt&idx.; run;
data ds&idx.; set ds&idx.;
retain cnt;
by city DESCENDING dt&idx.;
if first.city then cnt=0; else cnt=cnt+1;
run;
data ds&idx.(drop=cnt); set ds&idx.(where=(cnt<3)); rename dt&idx.=act&idx.; run;
%end;
%mend;
You will run this macro with:
%help(ds=sample_ds);
In the first statement of the macro I select the variables on which I want to iterate:
data vars(keep=dt:); set &ds; if _n_ not >0; run;
Work on this if you want to make this work for your code, or simply rename your variables as DT1 DT2...
Let me know if it is correct for you.
When writing macro code, always keep in mind what has to be done when. SAS processes your code stepwise:
Before your SAS code is even compiled, your macro variables are resolved and your macro code is executed.
Then the resulting Base SAS code is compiled.
Finally the code is executed.
When you write %if MATERNAL_CARE_&i. ne . %then %do, this is macro code, interpreted before compilation.
At that time MATERNAL_CARE_&i. is not a variable but a text string containing a macro variable.
The first time you run through your %do i = 2013 %to 2004 %by -1, it is filled in as MATERNAL_CARE_2013, the second time as MATERNAL_CARE_2012, etc.
Then the macro %if statement is interpreted, and as the text string MATERNAL_CARE_2013 is not equal to a dot, it always evaluates to TRUE,
so recent_&rc. = MATERNAL_CARE_&i. is passed to the compiler unconditionally for every year and every &rc. The assignments for the last year generated (2004) therefore overwrite all the others, which is why recent_1, recent_2 and recent_3 end up identical.
You can see that if you run your code with options mprint;
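To check for yourself how the macro processor treats such a condition, a small sketch like this can help. The macro name timing_demo is made up for illustration. It runs the same kind of %if outside any data step and writes to the log which branch was taken.

```sas
%macro timing_demo;
  %local i;
  %let i = 2013;
  /* Macro-time test: compares the text MATERNAL_CARE_2013 with the text "." */
  /* No data set variable is involved, because no DATA step is running here  */
  %if MATERNAL_CARE_&i. ne . %then %put The macro-level test took the THEN branch;
  %else %put The macro-level test took the ELSE branch;
%mend timing_demo;
%timing_demo
```

Because the test runs without any data being read, its outcome cannot depend on the values stored in the data set.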
The resolution:
options mprint;
%macro test();
data Maternal_care_recent;
set wb;
%do i = 2013 %to 2004 %by -1;
if MATERNAL_CARE_&i. ne . then do;
/* store the value in the first recent_ variable that is still missing */
%do rc = 1 %to 3 %by 1;
%if &rc. > 1 %then %do; else %end;
if recent_&rc. = . then recent_&rc. = MATERNAL_CARE_&i.;
%end;
end;
%end;
** The : acts as a wild card here **;
keep country MATERNAL_CARE_: recent_:;
run;
%mend;
%test();
Now, before compilation of if MATERNAL_CARE_&i. ne . then do, only the &i. is evaluated, and if MATERNAL_CARE_2013 ne . then do is passed to the compiler.
The compiler sees this as a test of whether the SAS variable MATERNAL_CARE_2013 is missing, and that is just what you wanted.
Remark:
It is not essential that I moved the if statement above the %do rc loop. It is just more efficient because the condition is then evaluated less often.
It is however essential that you close every %do with an %end and every do with an end;
Remark:
You do not need %let rc = 1, because %do rc = 1 %to 3 already initialises &rc.
For completeness: SAS code is compiled stepwise.
The next PROC or data step and its macro code are only considered after the previous one has executed.
That is why you can write macro variables from a data step or an SQL select into that will influence the code you compile in your next step,
something you cannot do, for instance, with C++ precompilation.
Thanks everyone. Found a hybrid solution from a few solutions posted.
data sample_ds;
infile datalines dlm=',';
input country $ maternal_2004 maternal_2005
maternal_2006 maternal_2007 maternal_2008 maternal_2009 maternal_2010 maternal_2011 maternal_2012 maternal_2013;
datalines;
MS,5,0,5,0,5,.,5,.,5,.
MW,3,9,5,0,5,0,5,.,5,0
WE,3,9,5,0,5,.,.,.,.,0
HU,2,0,5,.,5,.,5,0,5,0
MI,1,8,5,0,5,0,5,.,5,0
HJ,1,7,5,0,5,0,.,0,.,0
CJ,6,1,5,0,5,0,5,0,5,0
CN,6,1,.,5,0,5,0,5,0,5
CE,6,5,0,5,0,.,0,5,.,8
CT,2,5,0,5,0,5,0,5,0,9
CW,1,5,0,5,0,5,.,.,0,7
CH,0,5,0,5,0,.,0,.,0,5
;
%macro test(var);
data &var._recent;
set sample_ds;
keep country &var._1 &var._2 &var._3;
array mc {*} &var._2004-&var._2013;
array recent {*} &var._1-&var._25;
count=1;
do i = 10 to 1 by -1;
if mc[i] ne . then do;
recent[count] = mc[i];
count=count+1;
end;
end;
run;
%mend;
%test(maternal);

How can I use %SCAN within a macro variable name?

I'm trying to write robust code to assign values to macro variables. I want the names of the macro variables to depend on values coming from the variable 'subgroup'. So subgroup could equal 1, 2, or 45 etc., giving macro variable names trta_1, trta_2, trta_45 etc.
Where I am having difficulty is calling the macro variable name. So instead of calling e.g. &trta_1 I want to call &trta_%SCAN(&subgroups, &k), which resolves to trta_1 on the first iteration. I've used a %SCAN function in the macro variable name, which is throwing up a warning 'WARNING: Apparent symbolic reference TRTA_ not resolved.'. However, the macro variables have been created with values assigned.
How can I resolve the warning? Is there a function I could run with the %SCAN function to get this to work?
data data1 ;
input subgroup trta trtb ;
datalines ;
1 30 58
2 120 450
3 670 3
run;
%LET subgroups = 1 2 3 ;
%PUT &subgroups;
%MACRO test;
%DO k=1 %TO 3;
DATA test_&k;
SET data1;
WHERE subgroup = %SCAN(&subgroups, &k);
CALL SYMPUTX("TRTA_%SCAN(&subgroups, &k)", trta, 'G');
CALL SYMPUTX("TRTB_%SCAN(&subgroups, &k)", trtb, 'G');
RUN;
%PUT "&TRTA_%SCAN(&subgroups, &k)" "&TRTB_%SCAN(&subgroups, &k)";
%END;
%MEND test;
%test;
Using the structure you've provided the following will achieve the result you're looking for.
data data1;
input subgroup trta trtb;
datalines;
1 30 58
2 120 450
3 670 3
;
run;
%LET SUBGROUPS = 1 2 3;
%PUT &SUBGROUPS;
%MACRO TEST;
%DO K=1 %TO 3;
%LET X = %SCAN(&SUBGROUPS, &K) ;
data test_&k;
set data1;
where subgroup = &X ;
call symputx(cats("TRTA_",&X), trta, 'g');
call symputx(cats("TRTB_",&X), trtb, 'g');
run;
%PUT "&&TRTA_&X" "&&TRTB_&X";
%END;
%MEND TEST;
%TEST;
However, I'm not sure this approach is particularly robust. If your list of subgroups changes, you'd need to change the upper bound of the 'K' loop manually. Instead, you can determine the upper bound dynamically by counting the elements in your subgroup list.
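As a rough sketch of that dynamic count, COUNTW can be called through %SYSFUNC to get the number of elements. The macro name loop_subgroups and the list 1 2 45 are illustrative assumptions.

```sas
%let subgroups = 1 2 45;

%macro loop_subgroups;
  %local k n x;
  /* Count the blank-delimited elements in the list */
  %let n = %sysfunc(countw(&subgroups, %str( )));
  %do k = 1 %to &n;
    %let x = %scan(&subgroups, &k, %str( ));
    %put Element &k of &n is subgroup &x;
  %end;
%mend loop_subgroups;
%loop_subgroups
```

The loop bound now follows the list, so adding or removing a subgroup needs no other change.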
If you want to use the macro variables you've created later in your code, you could use a similar method.
data data2;
input subgroup value;
datalines;
1 20
2 25
3 15
45 30
;
run ;
%MACRO TEST2;
%DO K=1 %TO 3;
%LET X = %SCAN(&SUBGROUPS, &K) ;
data data2 ;
set data2 ;
if subgroup = &X then percent = value/&&TRTB_&X ;
format percent percent9.2 ;
run ;
%END;
%MEND TEST2;
%TEST2 ;
Effectively, you're re-writing data2 on each iteration of the loop.
This should cover your requirements. You can load and unload an array of macro variables without a macro. I have included an alternative method that unloads the macro variable array with a macro, for comparison.
Load values into macro variables including Subgroup number within macro variable name e.g. TRTA_45.
data data1;
input subgroup trta trtb;
call symput ('TRTA_'||compress (subgroup), trta);
call symput ('TRTB_'||compress (subgroup), trtb);
datalines;
1 30 58
2 120 450
3 670 3
45 999 111
;
run;
No need for macro to load or refer to macro variables.
%put TRTA_45: &TRTA_45.;
%let Subgroup_num = 45;
%put TRTB_&subgroup_num.: &&TRTB_&subgroup_num.;
If you need to loop through the macro variables then you can use Proc SQL to generate a list of subgroups.
proc sql noprint;
select subgroup
, count (*)
into :subgroups separated by ' '
, :No_Subgroups
from data1
;
quit;
%put Subgroups: &subgroups.;
%put No_Subgroups: &No_Subgroups.;
Use a macro to loop through the macro variable array and populate a table.
%macro subgroups;
%local i x;
data subgroup_data_macro;
%do i = 1 %to &no_subgroups.;
/* resolve the subgroup first, then use && so that the full name is resolved on a second pass */
%let x = %SCAN(&subgroups, &i );
%PUT TRTA_&x.: %cmpres(&&TRTA_&x.);
%PUT TRTB_&x.: %cmpres(&&TRTB_&x.);
subgroup = &x.;
TRTA = %cmpres(&&TRTA_&x.);
TRTB = %cmpres(&&TRTB_&x.);
output;
%end;
run;
%mend subgroups;
%subgroups;
Or use a data step (outside a macro) to loop through the macro variable array and populate a table.
data subgroup_data_sans_macro;
do i = 1 to &no_subgroups.;
subgroup = SCAN("&subgroups", i );
TRTA = input (symget (compress ('TRTA_'||subgroup)),20.);
TRTB = input (symget (compress ('TRTB_'||subgroup)),20.);
output;
end;
run;
Ensure both methods (within and without a macro) produce the same result.
proc compare
base = subgroup_data_sans_macro
compare = subgroup_data_macro
;
run;

SAS macro variable change + array index

This is related to this question: SAS macro variable change.
The code below explains the problem:
%macro test (arg=);
options mlogic mprint symbolgen;
array arraytwo [%EVAL(&arg+1)] _temporary_;
sum=0;
%do i = 1 %to %EVAL(&arg+1);
sum=sum+&i;
arraytwo[&i]=sum;
%end;
return=arraytwo[&arg+1];
%mend test;
/* This is ok */
data dat1;
%test(arg=9);
run;
data dat2;
input M;
cards;
5
6
7
;
run;
/* This gives an error: A character operand was found in the %EVAL function or %IF condition where a numeric
operand is required. The condition was: M+1 */
data dat3;
set dat2;
%test(arg=M);
run;
So the question is: why does the last test fail? Thanks.
If you happen to be using SAS 9.2 or later you might want to look at proc fcmp to create a function to do this.
If you write it as a function instead of a macro, you can pass in data set variables that would resolve to numeric values - or pass numeric values directly. For example, try this code:
proc fcmp outlib=work.funcs.simple;
function sumloop(iter);
x=0;
do i=1 to iter+1;
x=x+i;
end;
return(x);
endsub;
run;
/* point to the location the function was saved in */
option cmplib=work.funcs;
data _null_;
input M;
y=sumloop(M); /* data set variable */
z=sumloop(9); /* static numeric value */
put M= @7 y= @14 z= @20 ;
cards;
1
2
3
4
5
6
7
8
9
;
run;
/* My log looks like this:
14 data _null_;
15 input M;
16 y=sumloop(M);
17 z=sumloop(9);
18 put M= @7 y= @14 z= @20 ;
19 cards;
M=1 y=3 z=55
M=2 y=6 z=55
M=3 y=10 z=55
M=4 y=15 z=55
M=5 y=21 z=55
M=6 y=28 z=55
M=7 y=36 z=55
M=8 y=45 z=55
M=9 y=55 z=55
*/
I have to say I'm not entirely sure what you're trying to do; but does this give you the results you're looking for? The problem with your code above is the way you are trying to combine dataset variables and macro variables-- it isn't as easy to do as one might hope...
%macro test (argList=, totNumObs=);
%local arg;
%local j;
%local i;
%do j = 1 %to &totNumObs;
%let arg = %scan(&argList, &j);
array array&j [%EVAL(&arg+1)] _temporary_;
sum = 0;
%do i = 1 %to %EVAL(&arg+1);
sum = sum+&i;
array&j[&i] = sum;
%end;
return = array&j[&arg+1];
output;
%end;
%mend test;
data dat2;
input M;
cards;
5
6
7
;
run;
proc sql noprint;
select
count (*) into :numObs
from dat2 ;
select
M into :listofMs separated by ' '
from dat2
order by M;
quit;
options mlogic mprint symbolgen;
data dat3;
%test(argList= &listofMs, totNumObs= &numObs);
run;
proc print data= dat3;
run;
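The original error arises because %EVAL runs before the DATA step executes, so it only ever sees the text M and never a value. If the aim is simply a running sum up to M+1 per row, a plain DATA step loop avoids macro code entirely. This is only a sketch: the array size of 100 is an assumed upper bound for M+1, and result is a made-up variable name.

```sas
data dat2;
  input M;
  cards;
5
6
7
;
run;

data dat3;
  set dat2;
  /* Assumption: M+1 never exceeds 100, so a fixed-size temporary array is enough */
  array arraytwo[100] _temporary_;
  sum = 0;
  /* M is read at execution time, so an ordinary DO loop can use it as a bound */
  do i = 1 to M + 1;
    sum = sum + i;
    arraytwo[i] = sum;
  end;
  result = arraytwo[M + 1];
  drop i sum;
run;

proc print data=dat3;
run;
```

Here the loop bound is a data set value evaluated row by row, which a macro %DO loop cannot use.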