I am trying to run a regression with two independent variables automatically selected (meeting a certain criterion) from a variable list. For example, my variable list is:
Var1 Var2 Var3 Var4 Var5
I am trying to run 10 regressions using the pattern:
outcomeVar = var1 var2
OutcomeVar = var1 var3
.
.
.
OutcomeVar = var2 var3
.
.
.
OutcomeVar = var4 Var5
I am trying to write a macro containing a loop that will automatically build these regressions. I am trying to use the %scan function to drive this loop but cannot formulate a criterion for variable selection.
A nested loop is one option :
%MACRO COMBI ;
%LET NVAR = 5 ;
%DO X = 1 %TO %EVAL(&NVAR - 1) ;
%DO Y = %EVAL(&X + 1) %TO &NVAR ;
%LET OUTCOMEVAR = VAR&X VAR&Y ;
%PUT &OUTCOMEVAR ;
/* do something else with outcomevar */
%END ;
%END ;
%MEND ;
%COMBI ;
If your variables aren't actually numbered and sequential, you'd need to adopt a slightly different approach :
%MACRO COMBI ;
%LET VARLIST = somevar thisvar thatvar varx vary ;
%LET NVAR = %SYSFUNC(countw(&VARLIST)) ;
%DO X = 1 %TO %EVAL(&NVAR - 1) ;
%DO Y = %EVAL(&X + 1) %TO &NVAR ;
%LET OUTCOMEVAR = %SYSFUNC(scan(&VARLIST,&X)) %SYSFUNC(scan(&VARLIST,&Y)) ;
%PUT &OUTCOMEVAR ;
/* do something else with outcomevar */
%END ;
%END ;
%MEND ;
%COMBI ;
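To go from printing each pair to actually fitting the models, the %PUT can be swapped for a procedure call. The following is only a sketch under stated assumptions: the macro name run_pairs and its parameters data=, outcome= and varlist= are hypothetical, PROC REG (SAS/STAT) is assumed to be the regression procedure you want, and SASHELP.CARS is used purely as sample data.

```sas
/* Hypothetical helper: fit one two-predictor regression per pair of variables */
%macro run_pairs(data=, outcome=, varlist=);
  %local nvar x y;
  /* count the space-separated names in the list */
  %let nvar = %sysfunc(countw(&varlist, %str( )));
  %do x = 1 %to %eval(&nvar - 1);
    %do y = %eval(&x + 1) %to &nvar;
      title "&outcome = %scan(&varlist, &x, %str( )) %scan(&varlist, &y, %str( ))";
      proc reg data=&data;
        model &outcome = %scan(&varlist, &x, %str( )) %scan(&varlist, &y, %str( ));
      run;
      quit;
    %end;
  %end;
  title;
%mend run_pairs;

/* Five candidate predictors give 10 distinct pairs, hence 10 regressions */
%run_pairs(data=sashelp.cars, outcome=msrp,
           varlist=horsepower weight length enginesize mpg_city)
```

Starting the inner loop at &x + 1 is what keeps each pair from being generated twice or paired with itself.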
I want to assign a string to a variable in a loop and write out the variable to a dataset on each iteration.
Here is the code that prints out each variable
%macro t_size(inlib=,inds=);
%let one_gig = 5000;
proc sql noprint;
select ceil((nobs*obslen)/&one_gig) into :tsize
from sashelp.vtable where libname=upcase("&inlib") and memname=upcase("&inds");
quit;
%let no_of_tables=%eval(%sysfunc(int(&tsize)));
%if (&tsize gt 1) %then
%do i = 1 %to &no_of_tables;
%put &inds._&i.;
%end;
%else
%do;
%put &inds.;
%end;
%mend;
%t_size(inlib=SASHELP,inds=SHOES);
run;
This produces the required output:
SHOES_1
SHOES_2
SHOES_3
SHOES_4
SHOES_5
SHOES_6
SHOES_7
Instead of printing the variables out to the log I want to write them to a new, empty dataset.
It appears you are attempting to split a data set FOO into N one_gig pieces FOO_1 to FOO_N. Your first step also appears to be creating the FOO target table names. Computing the split names within a DATA step will save the computed names.
Example:
%macro make_split_names(data=, out=split_names, splitsize=5000);
%local lib mem;
%let syslast = &data;
%let lib = %scan(&data,1,.);
%let mem = %scan(&data,2,.);
data &out;
ds = open ("&data");
nobs = attrn(ds, 'NOBS');
lrecl = attrn(ds, 'LRECL');
ds = close(ds);
do n = 1 to ceil ( nobs * lrecl / &splitsize );
name = catx("_", "&mem", n);
OUTPUT;
end;
keep name;
run;
%mend;
%make_split_names (data=sashelp.cars)
If you want a dataset then replace your last block of macro logic with a data step.
data member_list ;
length memname $32 ;
if &no_of_tables > 1 then do i=1 to &no_of_tables;
memname=catx('_',"&inds",i);
output;
end;
else do;
memname="&inds";
output;
end;
keep memname;
run;
Solution:
%macro t_size(inlib=,inds=);
%let one_gig = 5000;
proc sql noprint;
select ceil((nobs*obslen)/&one_gig) into :tsize
from sashelp.vtable where libname=upcase("&inlib") and memname=upcase("&inds");
quit;
%let no_of_tables=%eval(%sysfunc(int(&tsize)));
data temp;
length temp $100;
%if (&tsize gt 1) %then
%do i = 1 %to &no_of_tables;
temp= "&inds._&i.";
output;
%end;
%else
%do;
temp= "&inds.";
output;
%end;
run;
%mend;
%t_size(inlib=SASHELP,inds=SHOES);
run;
Just wrap the macro loop in a data step named temp, so that each generated name is assigned to the variable temp and written out with an OUTPUT statement.
Output:
+---------+
| temp |
+---------+
| SHOES_1 |
| SHOES_2 |
| SHOES_3 |
| SHOES_4 |
| SHOES_5 |
| SHOES_6 |
| SHOES_7 |
+---------+
The following macro makes an inner join between two tables containing one column from each table in addition to the joining column :
%macro ij(x=,y=,to=".default",xc=,yc=,by=);
%if &to = ".default" %then %let to = &from;
PROC SQL;
CREATE TABLE &to AS
SELECT t1.&xc, t2.&yc, t1.&by
FROM &x t1 INNER JOIN &y t2
ON t1.&by = t2.&by;
RUN;
%mend;
I want to find a way to use several columns in &xc, &yc and &by, as I don't think I can use vectors of variables.
My idea is to pass parameters as vectors of strings instead of simple variables, for example xc = {"col1" "col2"}, loop through them,
and use %let some_var= %sysfunc(dequote(&some_string)); to convert them back to variables.
Applied on xc only it would become something like:
%macro ij(x=,y=,to=".default",xc=,yc=,by=);
%if &to = ".default" %then %let to = &from;
PROC SQL;
CREATE TABLE &to AS
SELECT
%do i = 1 %to %NCOL(&xc)
%let xci = %sysfunc(dequote(&xc[1]));
t1.&xci,
%end;
t2.&yc, t1.&by
FROM &x t1 INNER JOIN &y t2
ON t1.&by = t2.&by;
RUN;
%mend;
But this loop fails. How could I make it work ?
Note: this is a simplified example, my ultimate ambition is to build join macros that would be as little verbose as possible and integrate data quality checks.
Really, this would be much easier to code using SAS dataset options instead of building complicated macro logic.
proc sql ;
create table want2 as
select *
from sashelp.class(keep=name age)
natural inner join sashelp.class(keep=name height weight)
;
quit;
I would suggest learning how to use data step code instead of SQL code. For most normal data manipulations it is clearer and simpler. Say you wanted to combine IN1 and IN2 on the variable ID and keep the variables A and B from IN1 and the variables X and Y from IN2.
data out ;
merge in1 in2 ;
by id ;
keep id a b x y ;
run;
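Two details are worth keeping in mind with this data step: MERGE with BY expects both inputs to be sorted by the BY variable, and on its own it also keeps rows that have no match. A minimal sketch of an inner-join-style merge follows; the sample rows and the flag names in_left and in_right are made up for illustration.

```sas
/* Hypothetical sample data, for illustration only */
data in1;
  input id a b;
  datalines;
1 10 100
2 20 200
3 30 300
;
data in2;
  input id x y;
  datalines;
2 5 50
3 6 60
4 7 70
;

/* MERGE ... BY requires both inputs to be sorted by the BY variable */
proc sort data=in1; by id; run;
proc sort data=in2; by id; run;

data out;
  merge in1(keep=id a b in=in_left)
        in2(keep=id x y in=in_right);
  by id;
  /* keep only ids present in both inputs (inner-join behaviour) */
  if in_left and in_right;
run;
```

This matches an SQL inner join for one-to-one and one-to-many keys; when both tables repeat the same key, a merge pairs rows differently from SQL, so it is not a drop-in replacement in that case.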
Second, I would resist the urge to generate too complex a web of macro code. It will make the programs harder to understand for the next programmer, including yourself two weeks later. Your particular example does not look like something that is worth coding as a macro. You are not really typing less information, just using a few commas in place of where your SQL code would have had keywords like FROM or JOIN.
Now to answer your actual question. To pass in a list of values to macro use a delimited list. When at all possible use space as the delimiter, but especially avoid using comma as the delimiter. This will be easier to type, easier to pass into the macro and easier to use since it matches the SAS language as you can see in the data step above. If you really need to generate code like SQL syntax that uses commas then have the macro code generate them where needed.
%macro ij
(x= /* First dataset name */
,y= /* Second dataset name */
,by= /* BY variable list */
,to= /* Output dataset name. If empty use data step to generate DATAn work name */
,xc= /* Variable list from first dataset */
,yc= /* Variable list from second dataset */
);
%if not %length(&to) %then %do;
* Let SAS generate a name for new dataset ;
data ; run;
%let to=&syslast ;
proc delete data=&to; run;
%end;
%if not %length(&xc) %then %let xc=*;
%if not %length(&yc) %then %let yc=*;
%local i sep ;
proc sql ;
create table &to as
select
%let sep= ;
%do i=1 %to %sysfunc(countw(&by)) ;
&sep.T1.%scan(&by,&i)
%let sep=,;
%end;
/* scan on blanks only, so that the default value * survives as a token */
%do i=1 %to %sysfunc(countw(&xc,%str( ))) ;
&sep.T1.%scan(&xc,&i,%str( ))
%end;
%do i=1 %to %sysfunc(countw(&yc,%str( ))) ;
&sep.T2.%scan(&yc,&i,%str( ))
%end;
from &x T1 inner join &y T2 on
%let sep= ;
%do i=1 %to %sysfunc(countw(&by)) ;
/* join conditions are combined with AND, not with commas */
&sep T1.%scan(&by,&i)=T2.%scan(&by,&i)
%let sep=and;
%end;
;
quit;
%mend ij ;
Try it:
options mprint;
%ij(x=sashelp.class,y=sashelp.class,by=name,to=want,xc=age,yc=height weight);
SAS LOG:
MPRINT(IJ): proc sql ;
MPRINT(IJ): create table want as select T1.name ,T1.age ,T2.height ,T2.weight from sashelp.class
T1 inner join sashelp.class T2 on T1.name=T2.name ;
NOTE: Table WORK.WANT created, with 19 rows and 4 columns.
MPRINT(IJ): quit;
Instead of vectors, think simple lists.
Pass your variable lists as unquoted, space-separated lists of values. The values are SAS variable names that can be scanned out as tokens.
%macro ij (x=, ...);
...
%local i token;
%let i = 1;
%do %while (%length(%scan(&X,&i)));
%let token = %scan(&X,&i);
&token.,/* emit the token as source code */
%let i = %eval(&i+1);
%end;
...
%mend;
%ij ( x = one two three, ... )
Be sure to localize all your macro variables to prevent unwanted side effects outside the macro.
For consistency I try to use i/o related macro parameters that mimic SAS Procs -- data=, out=, file=, ...
Some would say named arguments are verbose!
If your 'proto-code' expects the xci symbol to be some sort of serially numbered variable, it is not. You would have to use %local xc&i; %let xc&i= for assignment, and &&xc&i for resolution. Also, your original code references &from which is not passed.
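The serially numbered pattern described above (%local xc&i for declaration, &&xc&i for resolution) can be sketched as follows. The macro name show_indexed, its list= parameter and the item prefix are hypothetical, chosen only for this illustration.

```sas
/* Hypothetical demo of serially numbered macro variables */
%macro show_indexed(list=);
  %local i n;
  %let n = %sysfunc(countw(&list, %str( )));

  /* assignment: declare and fill item1, item2, ... */
  %do i = 1 %to &n;
    %local item&i;
    %let item&i = %scan(&list, &i, %str( ));
  %end;

  /* resolution: &&item&i first becomes &item1, which then resolves to its value */
  %do i = 1 %to &n;
    %put NOTE: item&i = &&item&i;
  %end;
%mend show_indexed;

%show_indexed(list=age height weight)
```

The double ampersand is needed because a single &item&i would try to resolve a macro variable named item before appending the index.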
Building is fun. I would also recommend surveying past conference papers and SAS literature for similar works that may already meet your goal.
You could start with a space-separated list of column names and avoid looping entirely:
/*Define list of columns*/
%let COLS = A B C;
%put COLS = &COLS;
/*Add table alias prefix*/
%let REGEX = %sysfunc(prxparse(s/(\S+)/t1.$1/));
%let COLS = %sysfunc(prxchange(&REGEX,-1,&COLS));
%put COLS = &COLS;
%syscall prxfree(REGEX);
/*Condense multiple spaces to a single space*/
%let COLS = %sysfunc(compbl(&COLS));
%put COLS = &COLS;
/*Replace spaces with commas*/
%let COLS = %sysfunc(translate(&COLS,%str(,),%str( )));
%put COLS = &COLS;
In the end, as @Tom noted, SAS dataset options are more convenient, and with them there is no need to loop over variables.
Here is the macro I came up with:
*--------------------------------------------------------------------------------------------- ;
* JOIN ;
* Performs any join (defaults to inner join). ;
* By default left table is overwritten (convenient for successive left joins) ;
* Performs a natural join so columns should be renamed accordingly through 'rename' parameters ;
*----------------------------------------------------------------------------------------------;
%macro join
(data1= /* left table */
,data2= /* right table */
,keep1= /* columns to keep (default: keep all), don't use with drop */
,keep2=
,drop1= /* columns to drop (default: none), don't use with keep */
,drop2=
,rename1= /* rename statement, such as 'old1 = new1 old2 = new2' */
,rename2=
,j=ij /* join type, either ij lj or rj */
,out= /* created table, by default data1 (left table is overwritten)*/
);
%if not %length(&out) %then %let out = &data1;
%if %length(&keep1) %then %let keep1 = keep=&keep1;
%if %length(&keep2) %then %let keep2 = keep=&keep2;
%if %length(&drop1) %then %let drop1 = drop=&drop1;
%if %length(&drop2) %then %let drop2 = drop=&drop2;
%if %length(&rename1) %then %let rename1 = rename=(&rename1);
%if %length(&rename2) %then %let rename2 = rename=(&rename2);
%let kdr1 =;
%let kdr2 =;
%if (%length(&keep1) | %length(&drop1) | %length(&rename1)) %then %let kdr1 = (&keep1&drop1 &rename1);
%if (%length(&keep2) | %length(&drop2) | %length(&rename2)) %then %let kdr2 = (&keep2&drop2 &rename2);
%if &j=lj %then %let j = LEFT JOIN;
%if &j=ij %then %let j = INNER JOIN;
%if &j=rj %then %let j = RIGHT JOIN;
proc sql;
create table &out as select *
from &data1&kdr1 t1 natural &j &data2&kdr2 t2;
quit;
%mend;
Reproducible Examples:
data temp1;
input letter $ number1 $;
datalines;
a 1
a 2
a 3
b 4
c 8
;
data temp2;
input letter $ letter2 $ number2 $;
datalines;
a c 666
b d 0
;
* left join on common columns into new table temp3;
%join(data1=temp1,data2=temp2,j=lj,out=temp3)
* inner join by default, overwriting temp1, after renaming to join on another column;
%join(data1=temp1,data2=temp2,drop2=letter,rename2= letter2=letter)
I've been reading up on how to DROP variables from my dataset that have null values in every observation - it seems the best way to do this is using the %DROPMISS macro function - however I'm getting an error msg - below is the code I'm trying and the error msg
Code
%DROPMISS (DSIN=dataset1, DSOUT=dataset2);
Log
4665 %DROPMISS (DSIN=dataset1, DSOUT=dataset2);
-
180
WARNING: Apparent invocation of macro DROPMISS not resolved.
ERROR 180-322: Statement is not valid or it is used out of proper order.
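You need to define the DROPMISS macro before you use it; it is not built into SAS, which is why the log reports that the invocation was not resolved. Note also that the macro below takes its datasets as the positional parameters DSNIN and DSNOUT, so once it is defined the call would look like %DROPMISS( dataset1, dataset2 ).
You can find it in the appendix (page 3) of this paper:
http://support.sas.com/resources/papers/proceedings10/048-2010.pdf
Or, better formatted, here: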
/******************/
options nomprint noSYMBOLGEN MLOGIC;
/****************************/
%macro DROPMISS( DSNIN /* name of input SAS dataset */
, DSNOUT /* name of output SAS dataset */
, NODROP= /* [optional] variables to be omitted from dropping even if
they have only missing values */
) ;
/* PURPOSE: To find both character and numeric variables that have only
* missing values and drop them if they are not in &NODROP
*
* NOTE: if there are no variables in the dataset, produce no variables
* processing code
*
*
* EXAMPLE OF USE:
* %DROPMISS( DSNIN, DSNOUT )
* %DROPMISS( DSNIN, DSNOUT, NODROP=A B C D--H X1-X100 )
* %DROPMISS( DSNIN, DSNOUT, NODROP=_numeric_ )
* %DROPMISS( DSNIN, DSNOUT, NOdrop=_character_ )
*/
%local I ;
%if "&DSNIN" = "&DSNOUT"
%then %do ;
%put /------------------------------------------------\ ;
%put | ERROR from DROPMISS: | ;
%put | Input Dataset has same name as Output Dataset. | ;
%put | Execution terminating forthwith. | ;
%put \------------------------------------------------/ ;
%goto L9999 ;
%end ;
/*###################################################################*/
/* begin executable code                                             */
/*###################################################################*/
/*===================================================================*/
/* Create dataset of variable names that have only missing values
/* exclude from the computation all names in &NODROP
/*===================================================================*/
proc contents data=&DSNIN( drop=&NODROP ) memtype=data noprint out=_cntnts_( keep=
name type ) ; run ;
%let N_CHAR = 0 ;
%let N_NUM = 0 ;
data _null_ ;
set _cntnts_ end=lastobs nobs=nobs ;
if nobs = 0 then stop ;
n_char + ( type = 2 ) ;
n_num + ( type = 1 ) ;
/* create macro vars containing final # of char, numeric variables */
if lastobs
then do ;
call symput( 'N_CHAR', left( put( n_char, 5. ))) ;
call symput( 'N_NUM' , left( put( n_num , 5. ))) ;
end ;
run ;
/*===================================================================*/
/* if there are no variables in dataset, stop further processing
/*===================================================================*/
%if %eval( &N_NUM + &N_CHAR ) = 0
%then %do ;
%put /----------------------------------\ ;
%put | ERROR from DROPMISS: | ;
%put | No variables in dataset. | ;
%put | Execution terminating forthwith. | ;
%put \----------------------------------/ ;
%goto L9999 ;
%end ;
/*===================================================================*/
/* put global macro names into global symbol table for later retrieval
/*===================================================================*/
%LET NUM0 =0;
%LET CHAR0 = 0;
%IF &N_NUM >0 %THEN %DO;
%do I = 1 %to &N_NUM ;
%global NUM&I ;
%end ;
%END;
%if &N_CHAR > 0 %THEN %DO;
%do I = 1 %to &N_CHAR ;
%global CHAR&I ;
%end ;
%END;
/*===================================================================*/
/* create macro vars containing variable names
/* efficiency note: could compute n_char, n_num here, but must declare
/* macro names to be global before stuffing them
/*===================================================================*/
proc sql noprint ;
%if &N_CHAR > 0 %then %str( select name into :CHAR1 - :CHAR&N_CHAR from
_cntnts_ where type = 2 ; ) ;
%if &N_NUM > 0 %then %str( select name into :NUM1 - :NUM&N_NUM from
_cntnts_ where type = 1 ; ) ;
quit ;
/*===================================================================*/
/* Determine the variables that are missing
/*
/*===================================================================*/
%IF &N_CHAR > 1 %THEN %DO;
%let N_CHAR_1 = %EVAL(&N_CHAR - 1);
%END;
Proc sql ;
select %do I= 1 %to &N_NUM; max (&&NUM&I) , %end; %IF &N_CHAR > 1 %THEN %DO;
%do I= 1 %to &N_CHAR_1; max(&&CHAR&I), %END; %end; MAX(&&CHAR&N_CHAR)
into
%do I= 1 %to &N_NUM; :NUMMAX&I , %END; %IF &N_CHAR > 1 %THEN %DO;
%do I= 1 %to &N_CHAR_1; :CHARMAX&I,%END; %END; :CHARMAX&N_CHAR
from &DSNIN;
quit;
/*===================================================================*/
/* initialize DROP_NUM, DROP_CHAR global macro vars
/*===================================================================*/
%let DROP_NUM = ;
%let DROP_CHAR = ;
%if &N_NUM > 0 %THEN %DO;
DATA _NULL_;
%do I = 1 %to &N_NUM ;
%IF &&NUMMAX&I =. %THEN %DO;
%let DROP_NUM = &DROP_NUM %qtrim( &&NUM&I ) ;
%END;
%end ;
RUN;
%END;
%IF &N_CHAR > 0 %THEN %DO;
DATA _NULL_;
%do I = 1 %to &N_CHAR ;
%IF "%qtrim(&&CHARMAX&I)" eq "" %THEN %DO;
%let DROP_CHAR = &DROP_CHAR %qtrim( &&CHAR&I ) ;
%END;
%end ;
RUN;
%END;
/*===================================================================*/
/* Create output dataset
/*===================================================================*/
data &DSNOUT ;
/* drop char variables that have only missing values */
%if &DROP_CHAR ^= %then %str(DROP &DROP_CHAR ; ) ;
/* drop num variables that have only missing values */
%if &DROP_NUM ^= %then %str(DROP &DROP_NUM ; ) ;
set &DSNIN ;
%if &DROP_CHAR ^= or &DROP_NUM ^= %then %do;
%put /----------------------------------\ ;
%put | Variables dropped are &DROP_CHAR &DROP_NUM | ;
%put \----------------------------------/ ;
%end;
%if &DROP_CHAR = and &DROP_NUM = %then %do;
%put /----------------------------------\ ;
%put | No variables are dropped |;
%put \----------------------------------/ ;
%end;
run ;
%L9999:
%mend DROPMISS ;
I am trying to write a macro that should create multiple external html files . Here is my code
%macro createFiles;
%let name = Jupiter*Mercury*Venus;
%let htmlTxt1 = <html><h1>Hello To ;
%let htmlTxt2 = </h1></html> ;
%let i = 1 ;
%let thisName = %scan(&name., &i.,"*") ;
%do %while (&thisName. ne ) ;
filename thisFile "C:\Users\owner\Desktop\&thisName.html";
call execute ('data _null_; file &thisFile; put &htmlTxt1 || &thisName || &htmlTxt2; run; ') ;
%let i = %eval(&i + 1 ) ;
%let thisName = %scan(&name.,&i.,"*");
%end ;
%mend ;
%createFiles
However, it does not work . Please help me
Thanks
Mostly a combination of typos and syntax errors. In my opinion, the ODS HTML destination that SAS provides would be an easier way to create HTML files.
%macro createFiles;
%let name = Jupiter*Mercury*Venus;
%let htmlTxt1 = <html><h1>Hello To ;
%let htmlTxt2 = </h1></html> ;
%let i = 1 ;
%let thisName = %scan(&name., &i., *) ;
%do %while (&thisName. ne ) ;
filename thisFile "C:\temp\&thisName..html";
data _null_;
file thisFile;
/* macro variables resolve inside double quotes, so no || concatenation is needed */
put "&htmlTxt1 &thisName &htmlTxt2";
run;
%let i = %eval(&i + 1 ) ;
%let thisName = %scan(&name., &i., *) ;
%end ;
%mend ;
%createFiles
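For comparison, the ODS HTML route mentioned above could look roughly like the sketch below. The output path, the file name class_report.html and the title text are assumptions for illustration, and the result is a styled SAS report rather than the bare heading markup written by the PUT statement.

```sas
/* Assumed output path; adjust to a folder that exists on your system */
ods html file="C:\temp\class_report.html";

title "Hello To SASHELP.CLASS";
proc print data=sashelp.class;
run;
title;

/* closing the destination finishes writing the file */
ods html close;
```

This trades control over the exact markup for not having to write any HTML by hand.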
I used to use %do ... %to and it worked fine, but when I tried to list all the character values without %to, I got the message ERROR: Expected %TO not found in %DO statement
%macro printDB2 ;
%let thisName = ;
%do &thisName = 'Test1' , 'Test2' , 'Test3' ;
proc print data=&thisName ;
run ;
%end ;
%mend printDB2 ;
I know how to complete this task using %to or %while . But I am curious is it possible to list all character values in the %do ? How can I %do this ?
If your goal here is to loop through a series of character values in some macro logic, one approach you could take is to define corresponding sequentially named macro variables and loop through those, e.g.
%let mvar1 = A;
%let mvar2 = B;
%let mvar3 = C;
%macro example;
%do i = 1 %to 3;
%put mvar&i = &&mvar&i;
%end;
%mend example;
%example;
Alternatively, you could scan a list of values repeatedly and redefine the same macro var multiple times within your loop:
%let list_of_values = A B C;
%macro example2;
%do i = 1 %to 3;
%let mvar = %scan(&list_of_values, &i, %str( ));
%put mvar = &mvar;
%end;
%mend example2;
%example2;
I've explicitly specified that I want to use space as the only list delimiter for %scan; otherwise SAS uses lots of default delimiters.
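Applied to the PROC PRINT case from the question, the scanning approach could be sketched as below. The macro name print_each and its list= parameter are hypothetical, and the SASHELP tables stand in for your own datasets. Counting the words with countw avoids hard-coding the upper bound of the loop.

```sas
/* Hypothetical helper: run PROC PRINT once per dataset named in a list */
%macro print_each(list=);
  %local i ds;
  %do i = 1 %to %sysfunc(countw(&list, %str( )));
    /* space is the only delimiter, so two-level names like lib.member stay intact */
    %let ds = %scan(&list, &i, %str( ));
    proc print data=&ds;
    run;
  %end;
%mend print_each;

%print_each(list=sashelp.class sashelp.shoes)
```

With the default delimiters, the period in sashelp.class would split each name into two tokens, which is why the explicit %str( ) matters here.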