In the code below, I'm using macro variables in the THEN statement, but every variation I try fails in one way or another.
%MACRO LOOP_I;
DATA JAV_WORK2;
set WORK.JAV_WORK1;
%do i = 1 %to 24 %by 1;
%IF FF in ('CV', 'CV1', 'CV2', 'CVA', 'CVP', 'HAS') and S_TYPE in ('ETR_CARD', 'ETR_PCP', 'ETR_TRX') %THEN MONTH&i_SALES=trx&i;
%ELSE %IF FF = 'HAS' and LENGTH(GEO_ID) = 6 and S_TYPE = ('ETR_DDD') %THEN MONTH&i_SALES= UNIT&i;
%end;
RUN;
%MEND LOOP_I;
%LOOP_I
When I tried removing the % from the IF statements, I received "ERROR 180-322: Statement is not valid or it is used out of proper order". TIA
You can achieve the same results using arrays instead of macro loops.
data jav_work2;
set jav_work1;
array sales_month[24];
array trx[24];
array unit[24];
if (FF in ('CV', 'CV1', 'CV2', 'CVA', 'CVP', 'HAS')
and S_TYPE in ('ETR_CARD', 'ETR_PCP', 'ETR_TRX'))
then do;
do i = 1 to 24;
sales_month[i] = trx[i];
end;
end;
else if (FF = 'HAS' and length(GEO_ID) = 6 and S_TYPE = 'ETR_DDD') then do;
do i = 1 to 24;
sales_month[i] = unit[i];
end;
end;
run;
If you want to keep the original naming convention, you can enclose the step in a macro and replace the sales_month[24] array statement with this:
array sales_month[24] %do i = 1 %to 24; month&i._sales %end; ;
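For context, the original attempt fails for two reasons: %IF is evaluated by the macro processor before the DATA step runs, so it cannot see the values of FF or S_TYPE, and MONTH&i_SALES is read as a reference to a macro variable named i_SALES (a period is needed, as in month&i._sales). A minimal sketch of the combined approach follows, assuming jav_work1 contains trx1-trx24 and unit1-unit24 as in the question; the loop counter j is just a name chosen for this illustration.

```sas
%macro loop_i;
  %local i;
  data jav_work2;
    set jav_work1;
    /* The macro loop only generates the 24 names month1_sales ... month24_sales */
    array sales_month[24] %do i = 1 %to 24; month&i._sales %end; ;
    array trx[24];   /* refers to trx1-trx24 */
    array unit[24];  /* refers to unit1-unit24 */
    /* Ordinary IF/THEN runs once per observation, so it can test data values */
    if FF in ('CV', 'CV1', 'CV2', 'CVA', 'CVP', 'HAS')
       and S_TYPE in ('ETR_CARD', 'ETR_PCP', 'ETR_TRX') then do;
      do j = 1 to 24;
        sales_month[j] = trx[j];
      end;
    end;
    else if FF = 'HAS' and length(GEO_ID) = 6 and S_TYPE = 'ETR_DDD' then do;
      do j = 1 to 24;
        sales_month[j] = unit[j];
      end;
    end;
    drop j;
  run;
%mend loop_i;
%loop_i
```

The macro language is used here only to write out variable names; all row-level logic stays in plain DATA step code.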
When developing ETL programs with SAS Data Integration (DI) Studio, each transformation you specify has a neat user interface for specifying the columns of the datasets (= tables) you create, along with their type, length and format.
When the existing transformations cannot do the job and you need user-written code, you should still do as much as possible with the point-and-click interface, because
it enables impact analysis and reverse impact analysis
your successors will probably not dive into the Base SAS code unless they really need to
if they ever need to change the columns or their format, they expect that changing the specifications in the user interface will do the job
Fortunately, SAS exposes the format of the output to the programmer of a user-written transformation through macro variables.
So I often write my code inside the user written transformation as
%macro format_output(data);
%do _col = 0 %to &&&data._col_count - 1;
length &&&data._col&_col._name &&&data._col&_col._type&&&data._col&_col._length;
%if %length(&&&data._col&_col._format) %then %do;
format &&&data._col&_col._name &&&data._col&_col._format;
%end;
%end;
%mend;
data &_OUTPUT1 (keep=&_OUTPUT1_keep);
%format_output(_OUTPUT1);
set &_INPUT1;
* actual code ;
run;
Does SAS supply an embedded macro or something to do the same thing?
For completeness: the way SAS discloses the structure of the output looks like
%let _OUTPUT1 = myLib.myTable;
%let _OUTPUT1_connect = ;
%let _OUTPUT1_engine = ;
%let _OUTPUT1_memtype = DATA;
%let _OUTPUT1_options = %nrquote();
%let _OUTPUT1_alter = %nrquote();
%let _OUTPUT1_path = %nrquote(/myTable_A5E5JYT6.C7000MOL%(WorkTable%));
%let _OUTPUT1_type = 1;
%let _OUTPUT1_label = %nrquote();
/* List of target columns to keep */
%let _OUTPUT1_keep = myID first_item <16 more items> last_item;
%let _OUTPUT1_col_count = 19;
%let _OUTPUT1_col0_name = myID;
%let _OUTPUT1_col0_table = myLib.myTable;
%let _OUTPUT1_col0_length = 8;
%let _OUTPUT1_col0_type = ;
%let _OUTPUT1_col0_format = 13.;
%let _OUTPUT1_col0_informat = 13.;
%let _OUTPUT1_col0_label = %nrquote(my ID);
%let _OUTPUT1_col0_input0 = myID;
%let _OUTPUT1_col0_exp = ;
%let _OUTPUT1_col0_input = myID;
%let _OUTPUT1_col0_input_count = 1;
%let _OUTPUT1_col1_name = first_item;
%let _OUTPUT1_col1_table = myLib.myTable;
%let _OUTPUT1_col1_length = 8;
%let _OUTPUT1_col1_type = $;
%let _OUTPUT1_col1_format = $CHAR8.;
%let _OUTPUT1_col1_informat = $CHAR8.;
%let _OUTPUT1_col1_label = %nrquote(first data item in my table);
%let _OUTPUT1_col1_input0 = first_item;
%let _OUTPUT1_col1_exp = ;
%let _OUTPUT1_col1_input = first_item;
%let _OUTPUT1_col1_input_count = 1;
<documentation about 16 more columns>
%let _OUTPUT1_col18_name = last_item;
%let _OUTPUT1_col18_table = myLib.myTable;
%let _OUTPUT1_col18_length = 16;
%let _OUTPUT1_col18_type = $;
%let _OUTPUT1_col18_format = $16.;
%let _OUTPUT1_col18_informat = $16.;
%let _OUTPUT1_col18_label = %nrquote(last data item in my table);
%let _OUTPUT1_col18_input0 = last_item;
%let _OUTPUT1_col18_exp = ;
%let _OUTPUT1_col18_input = last_item;
%let _OUTPUT1_col18_input_count = 1;
%let _OUTPUT1_filetype = WorkTable;
In SAS DI Studio, a user-written node can have more than one output table, each visible and accessible from the node icon. By right-clicking a table you can view and change its details (table name, type, length, format, informat, etc.). Please note that I am using version 4.6.
In my screenshot, my user-written node already has three output tables. To add a fourth one:
Right-click on the node
Select "Add Work Table"
A new window will open for you to enter the table details
In the code DI Studio generates for your node, SAS assigns _OUTPUT1, _OUTPUT2, _OUTPUT3 and _OUTPUT4 to your tables, so in your code you can reference them through the corresponding macro variables (&_OUTPUT1, &_OUTPUT2, and so on).
Once you have added your tables, you can see and edit all the output table details (length, format, informat) by selecting their properties in the visual interface (point and click). So you won't need a macro for that.
The following macro makes an inner join between two tables containing one column from each table in addition to the joining column :
%macro ij(x=,y=,to=".default",xc=,yc=,by=);
%if &to = ".default" %then %let to = &from;
PROC SQL;
CREATE TABLE &to AS
SELECT t1.&xc, t2.&yc, t1.&by
FROM &x t1 INNER JOIN &y t2
ON t1.&by = t2.&by;
RUN;
%mend;
I want to find a way to use several columns in &xc, &yc and &by.
I don't think I can use vectors of variables.
My idea is to pass the parameters as vectors of strings instead of single variables, for example xc = {"col1" "col2"}, and loop through them,
using %let some_var= %sysfunc(dequote(&some_string)); to convert them back to variable names.
Applied on xc only it would become something like:
%macro ij(x=,y=,to=".default",xc=,yc=,by=);
%if &to = ".default" %then %let to = &from;
PROC SQL;
CREATE TABLE &to AS
SELECT
%do i = 1 %to %NCOL(&xc)
%let xci = %sysfunc(dequote(&xc[1]));
t1.&xci,
%end;
t2.&yc, t1.&by
FROM &x t1 INNER JOIN &y t2
ON t1.&by = t2.&by;
RUN;
%mend;
But this loop fails. How could I make it work ?
Note: this is a simplified example, my ultimate ambition is to build join macros that would be as little verbose as possible and integrate data quality checks.
Really, this would be much easier to code using SAS dataset options instead of building complicated macro logic.
proc sql ;
create table want2 as
select *
from sashelp.class(keep=name age)
natural inner join sashelp.class(keep=name height weight)
;
quit;
I would suggest learning how to use data step code instead of SQL code. For most normal data manipulations it is clearer and simpler. Say you wanted to combine IN1 and IN2 on the variable ID and keep the variables A and B from IN1 and the variables X and Y from IN2.
data out ;
merge in1 in2 ;
by id ;
keep id a b x y ;
run;
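One caveat worth spelling out: a plain MERGE with BY also keeps rows whose ID appears in only one of the tables, so it behaves like a full outer join rather than an inner join. A minimal sketch of the inner-join equivalent, assuming both inputs are already sorted by ID; the flag names from1 and from2 are made up for this illustration:

```sas
data out ;
  /* IN= creates temporary 0/1 flags showing which table contributed to the row */
  merge in1(in=from1 keep=id a b)
        in2(in=from2 keep=id x y) ;
  by id ;
  /* Subsetting IF: keep only IDs present in both tables */
  if from1 and from2 ;
run;
```

MERGE and an SQL join also differ when ID is repeated in both tables, so this is only a like-for-like replacement when ID is unique in at least one of them.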
Second I would resist the urge to generate too complex a web of macro code. It will make the programs harder to understand for the next programmer. Including yourself two weeks later. Your particular example does not look like something that is worth coding as a macro. You are not really typing less information, just using a few commas in place of where your SQL code would have had keywords like FROM or JOIN.
Now to answer your actual question. To pass a list of values to a macro, use a delimited list. Whenever possible use a space as the delimiter, and especially avoid using a comma. A space-delimited list is easier to type, easier to pass into the macro and easier to use, since it matches the SAS language, as you can see in the data step above. If you really need to generate code like SQL syntax that uses commas, have the macro code generate them where needed.
%macro ij
(x= /* First dataset name */
,y= /* Second dataset name */
,by= /* BY variable list */
,to= /* Output dataset name. If empty use data step to generate DATAn work name */
,xc= /* Variable list from first dataset */
,yc= /* Variable list from second dataset */
);
%if not %length(&to) %then %do;
* Let SAS generate a name for new dataset ;
data ; run;
%let to=&syslast ;
proc delete data=&to; run;
%end;
%if not %length(&xc) %then %let xc=*;
%if not %length(&yc) %then %let yc=*;
%local i sep ;
proc sql ;
create table &to as
select
%let sep= ;
%do i=1 %to %sysfunc(countw(&by)) ;
&sep.T1.%scan(&by,&i)
%let sep=,;
%end;
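/* Use a blank as the only delimiter so the default value * is counted as a word */
%do i=1 %to %sysfunc(countw(&xc,%str( ))) ;
&sep.T1.%scan(&xc,&i,%str( ))
%end;
%do i=1 %to %sysfunc(countw(&yc,%str( ))) ;
&sep.T2.%scan(&yc,&i,%str( ))
%end;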
from &x T1 inner join &y T2 on
%let sep= ;
%do i=1 %to %sysfunc(countw(&by)) ;
&sep.T1.%scan(&by,&i)=T2.%scan(&by,&i)
%let sep=%str(and );
%end;
;
quit;
%mend ij ;
Try it:
options mprint;
%ij(x=sashelp.class,y=sashelp.class,by=name,to=want,xc=age,yc=height weight);
SAS LOG:
MPRINT(IJ): proc sql ;
MPRINT(IJ): create table want as select T1.name ,T1.age ,T2.height ,T2.weight from sashelp.class
T1 inner join sashelp.class T2 on T1.name=T2.name ;
NOTE: Table WORK.WANT created, with 19 rows and 4 columns.
MPRINT(IJ): quit;
Instead of vectors, think simple lists.
Pass your variable lists as an unquoted, space-separated list of values. The values are SAS variable names that can be scanned out as tokens.
%macro ij (x=, ...);
...
%local i token;
%let i = 1;
%do %while (%length(%scan(&X,&i)));
%let token = %scan(&X,&i);
&token.,/* emit the token as source code */
%let i = %eval(&i+1);
%end;
...
%mend;
%ij ( x = one two three, ... )
Be sure to localize all your macro variables to prevent unwanted side effects outside the macro.
For consistency I try to use i/o related macro parameters that mimic SAS Procs -- data=, out=, file=, ...
Some would say named arguments are verbose!
If your 'proto-code' expects the xci symbol to be some sort of serially numbered variable, it is not. You would have to use %local xc&i; %let xc&i= for assignment, and &&xc&i for resolution. Also, your original code references &from which is not passed.
Building is fun. I would also recommend surveying past conference papers and SAS literature for similar works that may already meet your goal.
You could start with a space-separated list of column names and avoid looping entirely:
/*Define list of columns*/
%let COLS = A B C;
%put COLS = &COLS;
/*Add table alias prefix*/
%let REGEX = %sysfunc(prxparse(s/(\S+)/t1.$1/));
%let COLS = %sysfunc(prxchange(&REGEX,-1,&COLS));
%put COLS = &COLS;
%syscall prxfree(REGEX);
/*Condense multiple spaces to a single space*/
%let COLS = %sysfunc(compbl(&COLS));
%put COLS = &COLS;
/*Replace spaces with commas*/
%let COLS = %sysfunc(translate(&COLS,%str(,),%str( )));
%put COLS = &COLS;
In the end, as @Tom noted, SAS dataset options are more convenient, and with them there is no need to loop over variables.
Here is the macro I came up with:
*--------------------------------------------------------------------------------------------- ;
* JOIN ;
* Performs any join (defaults to inner join). ;
* By default left table is overwritten (convenient for successive left joins) ;
* Performs a natural join so columns should be renamed accordingly through 'rename' parameters ;
*----------------------------------------------------------------------------------------------;
%macro join
(data1= /* left table */
,data2= /* right table */
,keep1= /* columns to keep (default: keep all), don't use with drop */
,keep2=
,drop1= /* columns to drop (default: none), don't use with keep */
,drop2=
,rename1= /* rename statement, such as 'old1 = new1 old2 = new2' */
,rename2=
,j=ij /* join type, either ij lj or rj */
,out= /* created table, by default data1 (left table is overwritten)*/
);
%if not %length(&out) %then %let out = &data1;
%if %length(&keep1) %then %let keep1 = keep=&keep1;
%if %length(&keep2) %then %let keep2 = keep=&keep2;
%if %length(&drop1) %then %let drop1 = drop=&drop1;
%if %length(&drop2) %then %let drop2 = drop=&drop2;
%if %length(&rename1) %then %let rename1 = rename=(&rename1);
%if %length(&rename2) %then %let rename2 = rename=(&rename2);
%let kdr1 =;
%let kdr2 =;
%if (%length(&keep1) | %length(&drop1) | %length(&rename1)) %then %let kdr1 = (&keep1&drop1 &rename1);
%if (%length(&keep2) | %length(&drop2) | %length(&rename2)) %then %let kdr2 = (&keep2&drop2 &rename2);
%if &j=lj %then %let j = LEFT JOIN;
%if &j=ij %then %let j = INNER JOIN;
%if &j=rj %then %let j = RIGHT JOIN;
proc sql;
create table &out as select *
from &data1&kdr1 t1 natural &j &data2&kdr2 t2;
quit;
%mend;
Reproducible Examples:
data temp1;
input letter $ number1 $;
datalines;
a 1
a 2
a 3
b 4
c 8
;
data temp2;
input letter $ letter2 $ number2 $;
datalines;
a c 666
b d 0
;
* left join on common columns into new table temp3;
%join(data1=temp1,data2=temp2,j=lj,out=temp3)
* inner join by default, overwriting temp1, after renaming to join on another column;
%join(data1=temp1,data2=temp2,drop2=letter,rename2= letter2=letter)
I want to find the best model specification for a logit regression with a dependent variable that is multinomially distributed.
Y has three outcomes, and I want to build a forecasting model with two variables: a lagged and differenced spot-rate time series and a time series of the estimated realized volatility.
My initial thought was to create a loop that goes through each specification and outputs the AIC value, so that I can backtrack and find the best model.
This is working, but there's a hitch. I want to look at the spot rate in the following way (example):
Spot_t - Spot_t-n (n could be 21).
This opens up a whole lot of specifications. In my trial regression I included 12 variables of each kind, each lagged by 21 days * the number of the variable. This gave a good model, but I think I need a better iterative process.
If I limit my model to 12 variables/lags of each variable, we are talking about 24 nested loops. Within these loops there will be many identical iterations, which is time-consuming and silly in my opinion. Maybe there is a way to bypass this issue.
I am not used to coding in SAS. I have decent experience in VBA.
My code is pasted below, and if you have any idea how to do this differently I would really appreciate it!
Maybe it's possible to do with arrays or something like that - but I am not used to SAS programming, so maybe you could shed some light on how to do all this :)
%macro Selectvariables;
%let y = 0;
%let z = 2;
%do a = 1 %to &z;
%do b = 1 %to &z;
%do c = 1 %to &z;
%do d = 1 %to &z;
%do e = 1 %to &z;
%do f = 1 %to &z;
%do g = 1 %to &z;
%do h = 1 %to &z;
%do i = 1 %to &z;
%do j = 1 %to &z;
%do k = 1 %to &z;
%do l = 1 %to &z;
%do m = 1 %to &z;
%do n = 1 %to &z;
%do o = 1 %to &z;
%do p = 1 %to &z;
%do q = 1 %to &z;
%do r = 1 %to &z;
%do s = 1 %to &z;
%do t = 1 %to &z;
%do u = 1 %to &z;
%do v = 1 %to &z;
%do w = 1 %to &z;
%do x = 1 %to &z;
%let First_Spot_var = Spotlag_&a;
%let Second_Spot_var = Spotlag_&b;
%let Third_Spot_var = Spotlag_&c;
%let Fourth_Spot_var = Spotlag_&d;
%let Fifth_Spot_var = Spotlag_&e;
%let Sixth_Spot_var = Spotlag_&f;
%let Seventh_Spot_var = Spotlag_&g;
%let Eighth_Spot_var = Spotlag_&h;
%let Nine_Spot_var = Spotlag_&i;
%let Tenth_Spot_var = Spotlag_&j;
%let Eleventh_Spot_var = Spotlag_&k;
%let Twelveth_Spot_var = Spotlag_&l;
%let First_vol_var = vollag_&m;
%let Second_vol_var = vollag_&n;
%let Third_vol_var = vollag_&o;
%let Fourth_vol_var = vollag_&p;
%let Fifth_vol_var = vollag_&q;
%let Sixth_vol_var = vollag_&r;
%let Seventh_vol_var = vollag_&s;
%let Eighth_vol_var = vollag_&t;
%let Nine_vol_var = vollag_&u;
%let Tenth_vol_var = vollag_&v;
%let Eleventh_vol_var = vollag_&w;
%let Twelveth_vol_var = vollag_&x;
%let Name = Model_&y;
proc Logistic data=CurrencyData;
&Name.: model Y1_Optimal_Strategy_3M = &First_Spot_var &Second_Spot_var &Third_Spot_var &Fourth_Spot_var &Fifth_Spot_var &Sixth_Spot_var &Seventh_Spot_var &Eighth_Spot_var &Nine_Spot_var &Tenth_Spot_var &Eleventh_Spot_var &Twelveth_Spot_var &First_vol_var &Second_vol_var &Third_vol_var &Fourth_vol_var &Fifth_vol_var &Sixth_vol_var &Seventh_vol_var &Eighth_vol_var &Nine_vol_var &Tenth_vol_var &Eleventh_vol_var &Twelveth_vol_var;
ods output FitStatistics=AIC_&Name(where=(criterion="AIC"));
run;
%let y = %Eval(&y+1);
%end;
%end;
%end;
%end;
%end;
%end;
%end;
%end;
%end;
%end;
%end;
%end;
%end;
%end;
%end;
%end;
%end;
%end;
%end;
%end;
%end;
%end;
%end;
%end;
data AllAIC;
set AIC_: INDSNAME=modelVars;
dsname = scan(modelVars, 2);
run;
proc sort data=AllAIC out=allAIC_Sorted;
by InterceptAndCovariates;
run;
proc Print; run;
%mend;
Sorry for the crazy wide code. Hope you can help me. Maybe I am overcomplicating the issue. :)
Thanks a lot.
Best regards,
Christian
EDIT: I have set z = 2 just for testing purposes. Ideally this would be considerably higher.
I'm not sure there is a BEST way to do this. This is a problem statisticians have come up against for a long time.
You should look through the automated variable selection algorithms available in PROC LOGISTIC.
https://support.sas.com/documentation/cdl/en/statug/68162/HTML/default/viewer.htm#statug_logistic_syntax22.htm
If you have it installed and have a multi-core machine with enough RAM, PROC HPLOGISTIC will probably do the selection faster.
https://support.sas.com/documentation/cdl/en/statug/68162/HTML/default/viewer.htm#statug_hplogistic_toc.htm
I recommend looking at Cross Validated (StackExchange for Statistics) to research the pros and cons of each selection method.
https://stats.stackexchange.com/
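As a rough sketch of what the built-in selection could look like, assuming the dataset and variable names from the question (CurrencyData, Spotlag_1 to Spotlag_12, vollag_1 to vollag_12). The entry and stay thresholds of 0.05 are arbitrary values chosen for illustration, not recommendations:

```sas
proc logistic data=CurrencyData;
  /* Offer all 24 candidate lags at once and let the procedure add and   */
  /* remove effects, instead of fitting every combination in nested loops */
  model Y1_Optimal_Strategy_3M = Spotlag_1-Spotlag_12 vollag_1-vollag_12
        / selection=stepwise slentry=0.05 slstay=0.05;
run;
```

With a three-level response and no LINK= option, PROC LOGISTIC fits a cumulative logit model, which treats the outcomes as ordered; whether that suits this Y is a modelling decision the sketch does not settle.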
I used to use %do ... %to and it worked fine, but when I tried to list all the character values without %to, I got the message ERROR: Expected %TO not found in %DO statement
%macro printDB2 ;
%let thisName = ;
%do &thisName = 'Test1' , 'Test2' , 'Test3' ;
proc print data=&thisName ;
run ;
%end ;
%mend printDB2 ;
I know how to complete this task using %to or %while. But I am curious: is it possible to list all the character values in the %do? How can I %do this?
If your goal here is to loop through a series of character values in some macro logic, one approach you could take is to define corresponding sequentially named macro variables and loop through those, e.g.
%let mvar1 = A;
%let mvar2 = B;
%let mvar3 = C;
%macro example;
%do i = 1 %to 3;
%put mvar&i = &&mvar&i;
%end;
%mend example;
%example;
Alternatively, you could scan a list of values repeatedly and redefine the same macro var multiple times within your loop:
%let list_of_values = A B C;
%macro example2;
%do i = 1 %to 3;
%let mvar = %scan(&list_of_values, &i, %str( ));
%put mvar = &mvar;
%end;
%mend example2;
%example2;
I've explicitly specified a space as the only list delimiter for %scan; otherwise SAS uses a long list of default delimiters.
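To answer the question directly: the macro %DO statement has only the iterative (%TO), %WHILE and %UNTIL forms, so a comma-separated value list like the one in a DATA step DO statement is not accepted. A minimal sketch applying the scan approach to the original task, assuming datasets Test1, Test2 and Test3 exist; the macro name print_all and its list parameter are invented for this illustration:

```sas
%macro print_all(list);
  %local i name;
  /* countw gives the number of space-separated names, so nothing is hard-coded */
  %do i = 1 %to %sysfunc(countw(&list,%str( )));
    %let name = %scan(&list, &i, %str( ));
    proc print data=&name;
    run;
  %end;
%mend print_all;

%print_all(Test1 Test2 Test3)
```

Passing the names unquoted matters here: with 'Test1' the generated statement would be proc print data='Test1', which SAS would treat as a file path rather than a dataset name.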