Iterate through vector and convert elements to unquoted variable names - sas

The following macro performs an inner join between two tables, keeping one column from each table in addition to the joining column:
%macro ij(x=,y=,to=".default",xc=,yc=,by=);
%if &to = ".default" %then %let to = &from;
PROC SQL;
CREATE TABLE &to AS
SELECT t1.&xc, t2.&yc, t1.&by
FROM &x t1 INNER JOIN &y t2
ON t1.&by = t2.&by;
RUN;
%mend;
I want to find a way to use several columns in &xc, &yc and &by.
As I don't think I can use vectors of variables, my idea is to pass the parameters as vectors of strings instead of single variables, for example xc = {"col1" "col2"}, and loop through them
using %let some_var= %sysfunc(dequote(&some_string)); to convert them back to variables.
Applied on xc only it would become something like:
%macro ij(x=,y=,to=".default",xc=,yc=,by=);
%if &to = ".default" %then %let to = &from;
PROC SQL;
CREATE TABLE &to AS
SELECT
%do i = 1 %to %NCOL(&xc)
%let xci = %sysfunc(dequote(&xc[1]));
t1.&xci,
%end;
t2.&yc, t1.&by
FROM &x t1 INNER JOIN &y t2
ON t1.&by = t2.&by;
RUN;
%mend;
But this loop fails. How can I make it work?
Note: this is a simplified example; my ultimate goal is to build join macros that are as concise as possible and that integrate data quality checks.

Really, this would be much easier to code using SAS dataset options instead of building complicated macro logic.
proc sql ;
create table want2 as
select *
from sashelp.class(keep=name age)
natural inner join sashelp.class(keep=name height weight)
;
quit;
I would suggest learning how to use data step code instead of SQL code. For most normal data manipulations it is clearer and simpler. Say you wanted to combine IN1 and IN2 on the variable ID and keep the variables A and B from IN1 and the variables X and Y from IN2.
data out ;
merge in1 in2 ;
by id ;
keep id a b x y ;
run;
Second, I would resist the urge to generate too complex a web of macro code. It will make the programs harder to understand for the next programmer, including yourself two weeks later. Your particular example does not look like something that is worth coding as a macro. You are not really typing less information, just using a few commas in place of where your SQL code would have had keywords like FROM or JOIN.
Now to answer your actual question. To pass a list of values to a macro, use a delimited list. When at all possible use space as the delimiter, and especially avoid using comma as the delimiter. This will be easier to type, easier to pass into the macro and easier to use, since it matches the SAS language as you can see in the data step above. If you really need to generate code like SQL syntax that uses commas, then have the macro code generate them where needed.
%macro ij
(x= /* First dataset name */
,y= /* Second dataset name */
,by= /* BY variable list */
,to= /* Output dataset name. If empty use data step to generate DATAn work name */
,xc= /* Variable list from first dataset */
,yc= /* Variable list from second dataset */
);
%if not %length(&to) %then %do;
* Let SAS generate a name for new dataset ;
data ; run;
%let to=&syslast ;
proc delete data=&to; run;
%end;
%if not %length(&xc) %then %let xc=*;
%if not %length(&yc) %then %let yc=*;
%local i sep ;
proc sql ;
create table &to as
select
%let sep= ;
%do i=1 %to %sysfunc(countw(&by)) ;
&sep.T1.%scan(&by,&i)
%let sep=,;
%end;
%do i=1 %to %sysfunc(countw(&xc)) ;
&sep.T1.%scan(&xc,&i)
%end;
%do i=1 %to %sysfunc(countw(&yc)) ;
&sep.T2.%scan(&yc,&i)
%end;
from &x T1 inner join &y T2 on
%let sep= ;
%do i=1 %to %sysfunc(countw(&by)) ;
&sep.T1.%scan(&by,&i)=T2.%scan(&by,&i)
%let sep=%str( and );
%end;
;
quit;
%mend ij ;
Try it:
options mprint;
%ij(x=sashelp.class,y=sashelp.class,by=name,to=want,xc=age,yc=height weight);
SAS LOG:
MPRINT(IJ): proc sql ;
MPRINT(IJ): create table want as select T1.name ,T1.age ,T2.height ,T2.weight from sashelp.class
T1 inner join sashelp.class T2 on T1.name=T2.name ;
NOTE: Table WORK.WANT created, with 19 rows and 4 columns.
MPRINT(IJ): quit;

Instead of vectors, think simple lists.
Pass your variable lists as unquoted, space-separated lists of values. The values are SAS variable names that can be scanned out as tokens.
%macro ij (x=, ...);
...
%local i token;
%let i = 1;
%do %while (%length(%scan(&X,&i)));
%let token = %scan(&X,&i);
&token.,/* emit the token as source code */
%let i = %eval(&i+1);
%end;
...
%mend;
%ij ( x = one two three, ... )
Be sure to localize all your macro variables to prevent unwanted side effects outside the macro.
For consistency I try to use i/o related macro parameters that mimic SAS Procs -- data=, out=, file=, ...
Some would say named arguments are verbose!
If your 'proto-code' expects the xci symbol to be some sort of serially numbered variable, it is not. You would have to use %local xc&i; %let xc&i= for assignment, and &&xc&i for resolution. Also, your original code references &from which is not passed.
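As a minimal sketch of the numbered-variable convention just described: the macro name demo_xc and the NOTE text are made up for illustration, and xc is assumed to be a non-empty, space-separated list of names.

```sas
%macro demo_xc(xc=);
  %local i n;
  /* count the space-separated names (assumes xc is not empty) */
  %let n = %sysfunc(countw(&xc, %str( )));
  %do i = 1 %to &n;
    %local xc&i;                          /* declare xc1, xc2, ... as local */
    %let xc&i = %scan(&xc, &i, %str( ));  /* assignment uses xc&i */
  %end;
  %do i = 1 %to &n;
    /* resolution needs the double ampersand: &&xc&i -> &xc1 -> value */
    %put NOTE: xc&i resolves to &&xc&i;
  %end;
%mend demo_xc;

%demo_xc(xc=col1 col2)
```

With the call shown, the log should contain one NOTE line per name, for example "xc1 resolves to col1".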
Building is fun. I would also recommend surveying past conference papers and SAS literature for similar works that may already meet your goal.

You could start with a space-separated list of column names and avoid looping entirely:
/*Define list of columns*/
%let COLS = A B C;
%put COLS = &COLS;
/*Add table alias prefix*/
%let REGEX = %sysfunc(prxparse(s/(\S+)/t1.$1/));
%let COLS = %sysfunc(prxchange(&REGEX,-1,&COLS));
%put COLS = &COLS;
%syscall prxfree(REGEX);
/*Condense multiple spaces to a single space*/
%let COLS = %sysfunc(compbl(&COLS));
%put COLS = &COLS;
/*Replace spaces with commas*/
%let COLS = %sysfunc(translate(&COLS,%str(,),%str( )));
%put COLS = &COLS;

In the end, as @Tom noted, SAS dataset options are more convenient, and using them one doesn't need to loop over variables.
Here is the macro I came up with:
*--------------------------------------------------------------------------------------------- ;
* JOIN ;
* Performs any join (defaults to inner join). ;
* By default left table is overwritten (convenient for successive left joins) ;
* Performs a natural join so columns should be renamed accordingly through 'rename' parameters ;
*----------------------------------------------------------------------------------------------;
%macro join
(data1= /* left table */
,data2= /* right table */
,keep1= /* columns to keep (default: keep all), don't use with drop */
,keep2=
,drop1= /* columns to drop (default: none), don't use with keep */
,drop2=
,rename1= /* rename statement, such as 'old1 = new1 old2 = new2 */
,rename2=
,j=ij /* join type, either ij lj or rj */
,out= /* created table, by default data1 (left table is overwritten)*/
);
%if not %length(&out) %then %let out = &data1;
%if %length(&keep1) %then %let keep1 = keep=&keep1;
%if %length(&keep2) %then %let keep2 = keep=&keep2;
%if %length(&drop1) %then %let drop1 = drop=&drop1;
%if %length(&drop2) %then %let drop2 = drop=&drop2;
%if %length(&rename1) %then %let rename1 = rename=(&rename1);
%if %length(&rename2) %then %let rename2 = rename=(&rename2);
%let kdr1 =;
%let kdr2 =;
%if (%length(&keep1) | %length(&drop1) | %length(&rename1)) %then %let kdr1 = (&keep1&drop1 &rename1);
%if (%length(&keep2) | %length(&drop2) | %length(&rename2)) %then %let kdr2 = (&keep2&drop2 &rename2);
%if &j=lj %then %let j = LEFT JOIN;
%if &j=ij %then %let j = INNER JOIN;
%if &j=rj %then %let j = RIGHT JOIN;
proc sql;
create table &out as select *
from &data1&kdr1 t1 natural &j &data2&kdr2 t2;
quit;
%mend;
Reproducible Examples:
data temp1;
input letter $ number1 $;
datalines;
a 1
a 2
a 3
b 4
c 8
;
data temp2;
input letter $ letter2 $ number2 $;
datalines;
a c 666
b d 0
;
* left join on common columns into new table temp3;
%join(data1=temp1,data2=temp2,j=lj,out=temp3)
* inner join by default, overwriting temp1, after renaming to join on another column;
%join(data1=temp1,data2=temp2,drop2=letter,rename2= letter2=letter)

Related

SAS LOOP MACRO LIST

Do you know how I can iterate over a list using SAS and a macro?
%LET table = item1 item2;/*List of all input*/
/*I try to iterate on the list using a macro*/
%MACRO Main_Extract ;
array orig[*] &table;
do i=1 to dim(orig);
%put orig[i];
end;
%MEND;
%Main_Extract;
If table, the list of items, is a list of variable names for an array, then you do not need a macro. Just use plain data step code and use the macro variable to list the array elements.
array orig &table;
do I = 1 to dim(orig);
put orig[I]=;
end;
When a macro variable contains a space separated list of items, the use of such inside of a macro is usually done by parsing out each item using %scan inside a %do loop. An example of when this is useful would be generating a series of select clauses for a Proc SQL statement.
One time use of parsing out each item
%macro special_sauce (items=);
%local i item;
%let i = 1;
%do %while (%length(%scan(&items,&i)));
%let item = %scan(&items,&i);
%put NOTE: code generated for &=item;
/* ... emit some SAS code or code-snippet involving &item ... */
&item._squared = &item ** 2; /* emit data step source statement that presumes item is a variable name that is being squared */
%let i = %eval(&i+1);
%end;
%mend;
options mprint;
data want;
set sashelp.class;
%special_sauce(items=age height)
run;
If the list of items is needed to be used more than once it is also helpful to store the individual items in local macro variables for easy re-use.
List of items used more than once, parse once and put items in a 'macro-array'. There is really no such thing as a macro-array, simply a convention of numerically suffixed symbol names that can be iterated over.
%macro special_sauce2 (items=);
%local i item itemCount;
%let i = 1;
%do %while (%length(%scan(&items,&i)));
%let item = %scan(&items,&i);
%let itemCount = &i; /* track number of items parsed */
%local item&i; /* local macro variable name with numeric suffix, sometimes called a macro array */
%let item&i = &item; /* save the extracted item */
%let i = %eval(&i+1);
%end;
/* use the items in the 'macro-array' */
%do i = 1 %to &itemCount;
%put NOTE: value of macro variable item&i is &&item&i;
&&item&i.._2 = &&item&i ** 2;
%end;
/* use the items in the 'macro-array' */
%do i = 1 %to &itemCount;
%put NOTE: Hi again, value of macro variable item&i is &&item&i;
&&item&i.._3 = &&item&i ** 3;
%end;
%mend;
options mprint;
data want;
set sashelp.class;
%special_sauce2(items=age height)
run;
Good rule of thumb, don't macro if you don't have to.

Loop in proc sql select

I need to break one variable into multiple variables (max length 2000).
For example, my string has length 10000000 (10 MB). I use:
proc sql;
create table str as
select
substr(string,2000,1) as field1,
substr(string,2000,2001) as field2,
.......
from data_table
Could I write a loop in the select statement so that I don't have to write out field1-field5000?
Thank you!
First
The substr() function takes 3 arguments:
substr(string, position <, length>)
string - string constant or field
position - starting position
length - length of the string you want to return
Second
In proc sql you can only use macro language loops, so you must write a macro program.
options mprint;
%macro substrLoop;
%let length = 2000;
%let endLoop = %eval(1000000/&length.);
proc sql;
create table str as
select
%do i = 1 %to &endLoop.;
substr(string, %eval(1 + (&i.-1)*&length.),&length.) as field&i.
%if &i ne &endLoop. %then ,;
%end;
from data_table;
quit;
%mend substrLoop;
%substrLoop
Explanation
options mprint;
lets you see in the log the code that was generated by the called macro.
%let length = 2000;
%let endLoop = %eval(1000000/&length.);
Setting macro variables for the length of each substring and calculating when the loop should end.
%do i = 1 %to &endLoop.;
substr(string, %eval(1 + (&i.-1)*&length.),&length.) as field&i.
%if &i ne &endLoop. %then ,;
%end;
The actual loop puts the calculated fields substr(string, 1,2000) as field1 , substr(string, 2001,2000) as field2 , etc. into the SQL code.
%if &i ne &endLoop. %then ,; is needed to avoid putting a comma after the last generated field.
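If PROC SQL is not a hard requirement, the same split can be written without any macro code. This is a different technique from the answer above: a data step array with an ordinary DO loop. The sketch below builds a small hypothetical data_table (a 10000-character string split into 5 fields) so it can be run on its own, and it uses substrn() rather than substr() as an assumption-friendly choice, since substrn() tolerates positions past the end of the value.

```sas
/* hypothetical sample input: 2000 A's, 2000 B's, then CCC */
data data_table;
  length string $ 10000;
  string = repeat('A', 1999) || repeat('B', 1999) || 'CCC';
run;

data str;
  set data_table;
  array field[5] $ 2000;            /* creates field1-field5 */
  do i = 1 to dim(field);
    /* chunk i starts at 1, 2001, 4001, ... */
    field[i] = substrn(string, 1 + (i - 1) * 2000, 2000);
  end;
  drop i;
run;
```

The number of fields and the chunk length are hard-coded here; they could be driven by macro variables in the same way as &length. in the macro version.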

How to scan a numeric variable

I have a table like this:
Lista_ID 1 4 7 10 ...
in total there are 100 numbers.
I want to pass each one of these numbers to a macro I created. I was trying to use 'scan' but read that it's only for character variables.
Here's the code I ran:
proc sql;
select ID INTO: LISTA_ID SEPARATED BY '*' from
WORK.AMOSTRA;
run;
PROC SQL;
SELECT COUNT(*) INTO: NR SEPARATED BY '*' FROM
WORK.AMOSTRA;
RUN;
%MACRO CICLO_teste();
%LET LIM_MSISDN = %EVAL(NR);
%LET I = %EVAL(1);
%DO %WHILE (&I<= &LIM_MSISDN);
%LET REF = %SCAN(LISTA_ID,&I,,'*');
DATA WORK.UP&REF;
SET WORK.BASE&REF;
FORMAT PERC_ACUM 9.3;
IF FIRST.ID_CLIENTE THEN PERC_ACUM=0;
PERC_ACUM+PERC;
RUN;
%LET I = %EVAL(&I+1);
%END;
%MEND;
%CICLO_TESTE;
The error was:
VARIABLE PERC IS UNINITIALIZED and
VARIABLE FIRST.ID_CLIENTE IS UNINITIALIZED.
What I want is to run this macro for each one of the IDs in the list I showed before, which are referenced in work.base&ref and work.up&ref.
How can I do it? What am I doing wrong?
Thanks!
Here's the CALL EXECUTE version.
%MACRO CICLO_teste(REF);
DATA WORK.UP&REF;
SET WORK.BASE&REF;
BY ID_CLIENTE;
FORMAT PERC_ACUM 9.3;
IF FIRST.ID_CLIENTE THEN PERC_ACUM=0;
PERC_ACUM+PERC;
RUN;
%MEND CICLO_TESTE;
DATA _NULL_;
SET amostra;
*CREATE YOUR MACRO CALL;
STR = CATT('%CICLO_TESTE(', ID, ')');
CALL EXECUTE(STR);
RUN;
First, you should note that SAS macro variable resolution is intrinsically a "text-based" copy-paste action. That is, all user-defined macro variables hold text. Therefore, %eval is unnecessary in this case.
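A small open-code sketch of that point (the variable names a, b and c are arbitrary): %let stores whatever text it is given, and %eval is only needed when integer arithmetic should actually be performed.

```sas
%let a = 1 + 2;          /* stored as the text "1 + 2", not evaluated */
%put a=&a;

%let b = %eval(1 + 2);   /* %eval performs integer arithmetic */
%put b=&b;

%let c = %eval(&a);      /* text is substituted first, then evaluated */
%put c=&c;
```

The log should show a=1 + 2, b=3 and c=3. Wrapping a plain constant or an existing macro variable in %eval, as in %EVAL(1), adds nothing.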
Other miscellaneous corrections include:
Check the %scan() function for correct usage. The first argument should be a text string WITHOUT QUOTES.
run is redundant in proc sql since each SQL statement is executed as soon as it is submitted. Use quit; to exit proc sql.
A semicolon is not required after a macro call (it sometimes causes unexpected problems).
Use %do %to for loops.
The code below should work.
data work.amostra;
input id;
cards;
1
4
7
10
;
run;
proc sql noprint;
select id into :lista_id separated by ' ' from work.amostra;
select count(*) into :nr separated by ' ' from work.amostra;
quit;
* check;
%put lista_id=&lista_id nr=&nr;
%macro ciclo_teste();
%local ref i;
%do i = 1 %to &nr;
%let ref = %scan(&lista_id, &i);
%*check;
%put ref = &ref;
/* your task below */
/* data work.up&ref;*/
/* set work.base&ref;*/
/* format perc_acum 9.3;*/
/* if first.id_cliente then perc_acum=0;*/
/* perc_acum + perc;*/
/* run; */
%end;
%mend;
%ciclo_teste()
tested on SAS 9.4 win7 x64
Edited:
In fact I would recommend doing this to avoid scanning a long string which is inefficient.
%macro tester();
/* get the number of obs (a more efficient way) */
%local NN;
proc sql noprint;
select nobs into :NN
from dictionary.tables
where upcase(libname) = 'WORK'
and upcase(memname) = 'AMOSTRA';
quit;
/* assign &ref by random access */
%do i = 1 %to &NN;
data _null_;
a = &i;
set work.amostra point=a;
call symputx('ref',id,'L');
stop;
run;
%*check;
%put ref = &ref;
/* your task below */
%end;
%mend;
%tester()
Please let me know if you have further questions.
Wow that seems like a lot of work. Why not just do the following:
data work.amostra;
input id;
cards;
1
4
7
10
;
run;
%macro test001;
proc sql noprint;
select count(*) into: cnt
from amostra;
quit;
%let cnt = &cnt;
proc sql noprint;
select id into: x1 - :x&cnt
from amostra;
quit;
%do i = 1 %to &cnt;
%let x&i = &&x&i;
%put &&x&i;
%end;
%mend test001;
%test001;
Now the variables &x1 - &&x&cnt hold your values and you can process them however you like.
In general, if your list is small enough (macro variables are limited to 64K characters), you are better off passing the list in a single delimited macro variable instead of multiple macro variables. Remember that PROC SQL automatically stores the row count in the macro variable SQLOBS, so there is no need to run the query twice. Or you can use %sysfunc(countw()) to count the number of entries in your delimited list.
proc sql noprint ;
select id into :idlist separated by '|' from .... ;
%let nr=&sqlobs;
quit;
...
%do i=1 %to &nr ;
%let id=%scan(&idlist,&i,|);
data up&id ;
...
%end;
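For the countw() alternative mentioned above, a minimal self-contained sketch might look like this. The macro name loop_ids and the NOTE text are hypothetical, and the list is assumed to be non-empty and pipe-delimited as in the example.

```sas
%macro loop_ids(idlist=);
  %local i id n;
  /* count the entries instead of relying on SQLOBS */
  %let n = %sysfunc(countw(&idlist,|));
  %do i = 1 %to &n;
    %let id = %scan(&idlist,&i,|);
    %put NOTE: processing id=&id;
    /* ... generate the data step for up&id here ... */
  %end;
%mend loop_ids;

%loop_ids(idlist=1|4|7|10)
```

This is handy when the list arrives as a macro parameter rather than straight from a query, so no SQLOBS value is available.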
If you do generate multiple macro variables there is no need to set the upper bound in advance as SAS will only create the number of macro variables it needs based on the number of observations returned by the query.
select id into :idval1 - from ... ;
%let nr=&sqlobs;
If you are using an older version of SAS then you need to set an upper bound on the macro variable range.
select id into :idval1 - :idval99999 from ... ;

Skip block of codes if macro variable is empty

I am creating a macro variable with the SAS code below. It's storing a list of data names where I need to replace certain values in specific variables.
proc sql noprint;
select distinct data_name
into :data_repl separated by ' '
from TP_attribute_matching
where Country="&Country_Name" and Replace_this ne ' ';
quit;
I would like to skip the following 2 blocks if data_repl is empty. These 2 blocks go through each data set and variables in that data set, and then replaces x with y.
/*Block 1*/
%do i=1 %to %_count_(word=&data_repl);
proc sql noprint;
select var_name,
Replace_this,
Replace_with
into :var_list_repl_&i. separated by ' ',
:repl_this_list_&i. separated by '#',
:repl_with_list_&i. separated by '#'
from TP_attribute_matching
where Replace_this ne ' ' and data_name="%scan(&data_repl,&i.)";
quit;
/* Block 2 */
%do i=1 %to %_count_(word=&data_repl);
data sasdata.%scan(&data_repl,&i);
set sasdata.%scan(&data_repl,&i);
%do j=1 %to %_count_(word=&&var_list_repl_&i.);
%let from=%scan("&&repl_this_list_&i.",&j,'#');
%let to=%scan("&&repl_with_list_&i.",&j,'#');
%scan(&&var_list_repl_&i.,&j)=translate(%scan(&&var_list_repl_&i.,&j),&to,&from);
%end;
run;
%end;
How should I do this? I was looking at %SKIP and if-then-leave, but cannot figure this out yet.
%IF and %DO are macro statements that can only be used inside a macro:
%macro DoSomething;
%if "&data_repl" ne "" %then %do;
/*Block 1*/
%do i=1 %to %_count_(word=&data_repl);
proc sql noprint;
select var_name,
Replace_this,
Replace_with
into :var_list_repl_&i. separated by ' ',
:repl_this_list_&i. separated by '#',
:repl_with_list_&i. separated by '#'
from TP_attribute_matching
where Replace_this ne ' ' and data_name="%scan(&data_repl,&i.)";
quit;
%end;
/* Block 2 */
%do i=1 %to %_count_(word=&data_repl);
data sasdata.%scan(&data_repl,&i);
set sasdata.%scan(&data_repl,&i);
%do j=1 %to %_count_(word=&&var_list_repl_&i.);
%let from=%scan("&&repl_this_list_&i.",&j,'#');
%let to=%scan("&&repl_with_list_&i.",&j,'#');
%scan(&&var_list_repl_&i.,&j)=translate(%scan(&&var_list_repl_&i.,&j),&to,&from);
%end;
run;
%end;
%end;
%mend;
%DoSomething
EDIT:
Instead of checking the string, you can use count from PROC SQL (&SQLOBS macro var)
%let SQLOBS=0; /* reset SQLOBS */
%let data_repl=; /* initialize data_repl,
would not be defined in case when no rows returned */
proc sql noprint;
select distinct data_name
into :data_repl separated by ' '
from TP_attribute_matching
where Country="&Country_Name" and Replace_this ne ' '
and not missing(data_name);
quit;
%let my_count = &SQLOBS; /* keep the record count from last PROC SQL */
...
%if &my_count gt 0 %then %do;
...
...
%end;
If you already have a main macro, there is no need to define a new one (I'm not sure what you're asking now).
First off, this is yet another good example where list processing basics would simplify the code to where you don't need to worry about your actual question. Will elaborate later.
Second off, the way these loops are usually coded is something like
%do %while (&macrovar ne );
which checks for empty and doesn't execute the loop at all if it's empty to start with. &macrovar there would be the result of the scan. IE:
%let i = 1;
%let scan_result = %scan(&data_repl., &i);
%* a macro %do cannot combine %to with %while so the counter is kept by hand;
%do %while (&scan_result ne );
... code
%let i = %eval(&i + 1);
%let scan_result = %scan(&data_repl., &i);
%end;
Going back to list processing, what you're ultimately doing is:
data &dataset.;
set &dataset.;
[for some set of &variables,&tos, &froms]
&variable. = translate(&variable.,&to.,&from.);
[/set of variables]
run;
So what you need is a couple of macros. Assuming you have a dataset with
<dataset> <varname> <to> <from>
You can call this pretty easily. Two ways:
Run it as a set of nested macros/calls. This is a bit messier, but might be a bit easier to understand.
%macro do_dataset(data=);
proc sql noprint;
select cats('%convert_Var(var=',varname,',to=',to,',from=',from,')')
into :convertlist separated by ' '
from dataset_with_conversions
where dataset="&data.";
quit;
data &data;
set &data;
&convertlist.;
run;
%mend do_dataset;
%macro convert_var(var=,to=,from=);
&var. = translate(&var.,"&to.","&from.");
%mend convert_var;
proc sql noprint;
select distinct cats('%do_dataset(data=',dataset,')')
into :dslist separated by ' '
from dataset_with_conversions;
quit;
&dslist;
Second, you can do all of that in one data step using call execute (rather than having two different steps). That is, use a by dataset statement; for first.dataset execute data <dataset>; set <dataset>; (filling in the name), execute the translate assignment for every row, and for last.dataset execute run;.
More complicated, but one pass solution - depends on your comfort level which you prefer, they should generally work similarly.
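A rough sketch of that one-pass CALL EXECUTE idea follows. The control table contents and the class_copy dataset are invented for the example; the column names dataset, varname, to and from follow the layout assumed in this answer.

```sas
/* hypothetical control table: one row per variable to convert */
data dataset_with_conversions;
  length dataset $32 varname $32 to $8 from $8;
  input dataset $ varname $ to $ from $;
  datalines;
class_copy name X A
class_copy sex Z M
;

/* hypothetical target dataset */
data class_copy;
  set sashelp.class;
run;

proc sort data=dataset_with_conversions;
  by dataset;
run;

data _null_;
  set dataset_with_conversions;
  by dataset;
  /* open a data step once per dataset */
  if first.dataset then
    call execute(catx(' ', 'data', dataset, '; set', dataset, ';'));
  /* one translate assignment per control row */
  call execute(cats(varname, ' = translate(', varname, ', "', to, '", "', from, '");'));
  /* close the step after the last row for this dataset */
  if last.dataset then call execute('run;');
run;
```

The generated statements are queued by CALL EXECUTE and run after the data _null_ step finishes, so each target dataset is rewritten in a single pass.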
If you want to skip something based on a parameter, you can add a check for its value, so that the guarded code does not run when data_repl is null. This avoids errors in code that depends on the value. For example,
if a library path is derived from the variable passed in, an empty value would lead to an invalid library path in the include statement, so the block can be skipped instead.
%macro DoSomething(data_repl=);
%if "&data_repl" ne "" %then %do;
/* your code goes here */
%end;
%mend;
%DoSomething(data_repl=&data_repl)

SAS Is it possible not to use "TO" in a do loop in MACRO?

I used to use %do ... %to and it worked fine, but when I tried to list all character values without %to I got the message ERROR: Expected %TO not found in %DO statement
%macro printDB2 ;
%let thisName = ;
%do &thisName = 'Test1' , 'Test2' , 'Test3' ;
proc print data=&thisName ;
run ;
%end ;
%mend printDB2 ;
I know how to complete this task using %to or %while. But I am curious: is it possible to list all character values in the %do? How can I %do this?
If your goal here is to loop through a series of character values in some macro logic, one approach you could take is to define corresponding sequentially named macro variables and loop through those, e.g.
%let mvar1 = A;
%let mvar2 = B;
%let mvar3 = C;
%macro example;
%do i = 1 %to 3;
%put mvar&i = &&mvar&i;
%end;
%mend example;
%example;
Alternatively, you could scan a list of values repeatedly and redefine the same macro var multiple times within your loop:
%let list_of_values = A B C;
%macro example2;
%do i = 1 %to 3;
%let mvar = %scan(&list_of_values, &i, %str( ));
%put mvar = &mvar;
%end;
%mend example2;
%example2;
I've explicitly specified that I want to use space as the only list delimiter for scan - otherwise SAS uses many default delimiters.
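Building on the second approach, the hard-coded upper bound of 3 can be replaced by counting the words in the list, which gets close to "listing the values in the %do". This is only a sketch: the macro name print_each is hypothetical, the list is assumed to be non-empty, and %put stands in for the proc print step.

```sas
%macro print_each(names);
  %local i name;
  /* count and scan with space as the only delimiter */
  %do i = 1 %to %sysfunc(countw(&names, %str( )));
    %let name = %scan(&names, &i, %str( ));
    %put NOTE: would run proc print for &name;
    /* proc print data=&name; run; */
  %end;
%mend print_each;

%print_each(Test1 Test2 Test3)
```

The values are passed unquoted; quotes would become part of each scanned name.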